Understanding the Internals of HashMap in Java
The HashMap class in Java is a widely used data structure that provides efficient storage and retrieval of key-value pairs. It is based on the concept of a hash table, which allows for constant-time average-case performance for insertion, deletion, and retrieval operations. In this article, we’ll explore the internal workings of HashMap and understand how it manages to achieve its efficiency.
Hashing and Buckets :
At the heart of the HashMap implementation is an array called the “bucket” array. This array is essentially an array of linked lists, where each element in the array stores a linked list of key-value pairs that have hashed to the same index in the array. The goal of this design is to handle collisions when different keys map to the same index.
When a key-value pair is added to the HashMap, the key is hashed to determine its index in the bucket array. The hash code is an integer value that is calculated based on the key’s contents using the hashCode()
method of the key object. The resulting hash code is then modified using a hash function to distribute the keys more evenly across the array. The modified hash code is used as the index to store the key-value pair in the bucket array.
Handling Collisions with Linked Lists
Collisions occur when two or more keys hash to the same index in the bucket array. To handle collisions, HashMap uses a linked list at each index to store the key-value pairs that have collided. When a collision occurs, the new key-value pair is simply added to the linked list at the corresponding index. This allows multiple key-value pairs to coexist at the same index in the array.
However, as the number of key-value pairs grows, the linked lists can become long, resulting in degraded performance. To address this issue, Java 8 introduced a mechanism called “bucket tree” or “balanced tree” for HashMap. When the number of key-value pairs in a linked list exceeds a certain threshold, the linked list is converted into a self-balancing binary search tree (specifically, a red-black tree) to improve search performance. This optimization ensures that even in the worst-case scenario, the time complexity for operations remains logarithmic.
Ensuring Key Uniqueness
HashMap enforces the uniqueness of keys by comparing the hash codes and the equality of the keys. When a new key-value pair is added to the HashMap, it first checks if the key already exists. If the hash code of the new key matches the hash code of an existing key at the same index in the bucket array, HashMap proceeds to compare the actual keys using the equals()
method. If the keys are equal, the new value replaces the old value. Otherwise, a new entry is added to the linked list (or tree) at that index.
Resizing and Load Factor
HashMap has a load factor, which determines when to resize the bucket array to maintain performance. The load factor is a measure of how full the HashMap can be before resizing occurs. By default, the load factor is set to 0.75, which means that when the HashMap is 75% full, it will resize the bucket array.
Resizing involves creating a new bucket array with a larger size and rehashing all the existing key-value pairs into the new array. This process redistributes the key-value pairs across the new array, reducing the chance of collisions and improving performance.
Resizing is a relatively expensive operation, as it requires iterating over all the key-value pairs in the existing array and rehashing them. However, it is necessary to maintain the expected constant-time performance. It’s important to choose an initial capacity that is sufficient for the anticipated number of
key-value pairs to avoid frequent resizing.
Example Internals of HashMap In Java
Sure! I can explain the internals of a HashMap in Java and provide you with an example program to illustrate its usage.
Internally, a HashMap in Java is implemented as an array of linked lists. The array is called the “bucket” array, and each element in the array is a linked list that stores key-value pairs. The purpose of this design is to handle collisions when different keys map to the same index in the array.
Here’s an example program that demonstrates the usage of a HashMap in Java:
import java.util.HashMap; public class HashMapExample { public static void main(String[] args) { // Create a HashMap HashMap<String, Integer> hashMap = new HashMap<>(); // Add key-value pairs to the HashMap hashMap.put("John", 25); hashMap.put("Jane", 30); hashMap.put("Alice", 20); // Access values using keys int johnAge = hashMap.get("John"); System.out.println("John's age: " + johnAge); // Update the value for a key hashMap.put("Jane", 32); // Remove a key-value pair hashMap.remove("Alice"); // Check if a key exists in the HashMap boolean containsKey = hashMap.containsKey("Alice"); System.out.println("Contains key 'Alice'? " + containsKey); // Iterate over the key-value pairs for (HashMap.Entry<String, Integer> entry : hashMap.entrySet()) { String key = entry.getKey(); int value = entry.getValue(); System.out.println(key + ": " + value); } } }
In this example, we first create a HashMap called hashMap
that stores key-value pairs. The keys are of type String
, and the values are of type Integer
.
We then add three key-value pairs using the put
method. Each key is associated with a value. Later, we retrieve the value for the key “John” using the get
method and print it.
Next, we update the value for the key “Jane” using the put
method again. We also remove the key “Alice” using the remove
method.
To check if a key exists in the HashMap, we use the containsKey
method and print the result.
Finally, we iterate over the key-value pairs in the HashMap using the entrySet
method, which returns a set of key-value pairs. We retrieve each key and value using the getKey
and getValue
methods, respectively, and print them.
Conclusion
In summary, a HashMap in Java is implemented as an array of linked lists (and balanced trees in Java 8+). It uses hashing and the concept of buckets to efficiently store and retrieve key-value pairs. Collisions are handled by storing multiple key-value pairs in the same index using linked lists, and for larger linked lists, a balanced tree is used. HashMap ensures key uniqueness by comparing hash codes and key equality. Resizing and load factor ensure performance remains constant-time by resizing the bucket array when necessary.
Understanding the internal workings of HashMap allows developers to make informed decisions when using this data structure. It is a versatile and powerful tool for storing and retrieving data efficiently in a wide range of applications.
Happy Learning..