Java Collection Framework – Part 3

Dewan Fazle Ayaz

1 January, 2025

In the previous sections, we discussed the sub-interfaces of the Collection interface and their implementation classes, including the internal mechanisms and use cases. If you missed it, please find it here.

This section will discuss the Map interface, its implementation classes, and how they work internally.

Map

Map Maps keys to values with no duplicate keys. Each key maps to at most one value. This interface replaces the old Dictionary class, offering more modern and flexible functionality (it uses Generic for flexibility and ensures type safety, which was a significant issue in Dictionary).

Before going into details about the implementation classes, it will be helpful to understand the following terminologies:

Load Factor: The load factor, defined as the ratio of entries to the current capacity, determines when to resize the Hashtable. It is typically set to 0.75 by default.

load factor =total number of elements (in all buckets) / number of buckets 

Initial Capacity: The initial capacity is the number of buckets in the Hashtable when it is first created, which affects performance and the number of collisions.

Hash Collision: A hash collision occurs when two different keys produce the same hash code and are mapped to the same bucket in the Hashtable.

Now, let’s discuss the implementation classes of the Map interface.

Implementation Classes

a) HashTable: It is a legacy class and part of Java’s original version. 

Internal Mechanism

  • This class internally implements a hash table data structure. Any non-null object can be used as a key or as a value. It uses a hash function to compute an index in an array of linked lists (buckets). It handles collisions by chaining entries in the same bucket and dynamically resizes the underlying array when the load factor exceeds a threshold. Hashtable is synchronized for thread safety, ensuring data consistency in multi-threaded environments. It uses Enumeration (it works on Object type, while Iterator works on generic type) for iteration.

b) HashMap: It is more versatile and widely used in modern Java applications.

Internal Mechanism

  • HashMap is also a hashtable-based implementation of the Map interface. It is generally preferred over Hashtable for its performance, flexibility with null values, and modern features like tree-based collision resolution. 
  • It is not synchronized by default, making it more efficient for single-threaded applications. If the number of entries in a bucket exceeds a certain threshold (8 by default), it converts the linked list into a red-black tree to optimize performance. This significantly improves lookup times for buckets with many collisions compared to Hashtable. It uses an iterator (works on generic type) for iteration, which is fail-fast.
  • When developing new applications, it’s advisable to use HashMap or ConcurrentHashMap for thread-safe needs.
Why should we use immutable objects as keys?

Using a mutable object as a key in a Hash Map can lead to issues because the hash code and key equality might change if the object is modified. This can result in unpredictable behavior, such as an inability to retrieve the value associated with the key.

Here’s an example code that demonstrates this pitfall:

				
					import java.util.HashMap;
import java.util.Map;

class MutableKey {
    private int key;
    public MutableKey(int key) {
        this.key = key;
    }
    public int getKey() {
        return key;
    }
    public void setKey(int key) {
        this.key = key;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(key);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        MutableKey that = (MutableKey) obj;
        return key == that.key;
    }
}

public class HashMapMutableKeyExample {
    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "Initial Value");

        // Print the value associated with the key
        System.out.println("Value before modification: " + map.get(key));

        // Modify the mutable key
        key.setKey(2);

        // Attempt to retrieve the value using the modified key
        System.out.println("Value after modification: " + map.get(key));
    }
}
				
			

Output:

				
					Value before modification: Initial Value
Value after modification: null
				
			

c) TreeMap: A TreeMap is a Red-Black tree-based implementation of the NavigableMap interface, which sorts keys based on their natural ordering or a provided comparator.  It also supports additional navigational features (you can move to higher, lower, and other values like navigation) from the NavigableMap interface, making it useful for ordered data retrieval. Note that this implementation is not synchronized.

Internal Mechanism

  • It maintains balanced key-value pairs. Each node contains a key, value, pointers to child nodes, and a color attribute to uphold Red-Black Tree properties. The color property balances the tree.

Performance
It ensures O(log n) time complexity for insertion, deletion, and lookup operations.

Iterable

Iterable is a generic interface in the java.lang package that represents a collection of elements that can be traversed sequentially.

The collection interface extends the Iterable interface, meaning all its implementation classes have implemented it. Therefore, we can iterate over all collection interfaces and classes using Iterable.

Benefits of Iterable

Standard Iteration: A consistent way to iterate over elements using an Iterator.

				
					 // Create and populate an ArrayList
        List<String> list = new ArrayList<>();
        list.add("Apple");
        list.add("Banana");
        list.add("Cherry");

        // Obtain an Iterator from the ArrayList
        Iterator<String> iterator = list.iterator();

        // Iterate through the ArrayList using the Iterator
        while (iterator.hasNext()) {
            // Get the next element
            String item = iterator.next();
            // Print the element
            System.out.println(item);
        }
				
			

Enhanced For-Loop: Enables use in enhanced for-loops (for each loop) for cleaner, more readable code.

				
					List<String> list = Arrays.asList("a", "b", "c");
for (String item : list) {
    System.out.println(item);
}
				
			

Integration with Java Streams: Allows collections to be used with Java Streams for functional-style operations.

				
					List<String> list = Arrays.asList("a", "b", "c");
list.stream().forEach(System.out::println);
				
			

Custom Collections: Facilitates custom implementations with standard iteration support.

Example: Creating a custom collection that implements Iterable for backward iteration:

				
					import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class CustomCollectionWithBackwardIterable implements Iterable<String> {
    private final List<String> list;

    public CustomCollectionWithBackwardIterable(List<String> list) {
        this.list = list;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int index = list.size() - 1;

            @Override
            public boolean hasNext() {
                return index >= 0;
            }

            @Override
            public String next() {
                if (index < 0) {
                    throw new IndexOutOfBoundsException("No more elements");
                }
                return list.get(index--);
            }
        };
    }

    public static void main(String[] args) {
        // Create and populate an ArrayList
        List<String> list = new ArrayList<>();
        list.add("Apple");
        list.add("Banana");
        list.add("Cherry");

        // Use custom Iterable for backward iteration
        CustomCollectionWithBackwardIterable backwardIterable = new CustomCollectionWithBackwardIterable(list);
        
        for (String item : backwardIterable) {
            System.out.println(item);
        }
    }
}
				
			

Output:

				
					Cherry
Banana
Apple
				
			
Why did Java introduce Iterable, although it internally returns an Iterator?

Java introduced the Iterable interface, even though it internally returns an Iterator, to provide a higher level of abstraction (isolating the iterator implementation) that makes it easier and more intuitive to work with collections and other data structures that can be iterated over. Here’s why the Iterable interface was introduced:

1. Simplifying the API Design

Before Iterable was introduced, you needed to explicitly call the iterator() method on a collection and manage the iteration manually to iterate over the collection.

Now, the Iterable interface standardizes the concept of iteration. You can iterate over its elements using the enhanced for-loop without explicitly obtaining an Iterator by making a collection implement Iterable. This makes the code cleaner and more concise:

				
					List<String> list = Arrays.asList("A", "B", "C");
for (String item : list) {
   System.out.println(item);
}
				
			

2. Separation of Concerns

By introducing an Iterable, Java separates the responsibility of creating an Iterator from the responsibility of iterating over elements. An iterable is concerned with providing an Iterator, while an Iterator is concerned with the mechanics of iteration.

What if Java directly implemented an Iterator instead of an Iterable for collection?

If Java had designed its collection classes to implement the Iterator interface instead of the Iterable interface directly, it would have led to several issues and limitations:

1. Single Iteration Point

Collections often need to support multiple types of iteration (e.g., forward, backward, etc.). If a collection class directly implements an Iterator, the collection itself must manage the iteration state. Thus, it would be challenging to provide different iterators for different iteration strategies (like reverse iteration) without bloating the collection class with multiple iteration methods (you need to add logic inside the hasNext() and next() methods. Moreover, there must be coordination, which might be challenging to manage). 

The Iterable-Iterator separation allows for flexible iteration strategies by returning different Iterator instances depending on the use cases.

2. Violation of SRP
Implementing an Iterator would force the collection to handle both the responsibilities of data storage and iteration management (Using Iterable, we don’t maintain the iteration by ourselves; instead, we just attach an iterator to it).

This violates the Single Responsibility Principle. Thus, the class would become more complex and more challenging to maintain.

3. Breakdown of Enhanced For-Loop Functionality

The enhanced for-loop in Java is designed to work with objects that implement the Iterable interface. If collections directly implemented the Iterator, the for-each loop would not work as intended. 

Thus, developers would lose their simplicity and readability. They would need to write manual loop constructs, making code more verbose and error-prone.

4. Reduced Clarity
Developers might understand that the collection itself is an iterator, which could lead to confusion. By keeping iteration logic in a separate Iterator object, Java allows developers to clearly distinguish between the collection and the iterator.

Why does Java throw a ConcurrentModificationException when removing an element inside the for-each loop?

This fail-fast behavior is designed to prevent subtle and hard-to-debug issues that could arise if a collection is modified during iteration. If modifications were allowed without checks, it could lead to inconsistent states or unpredictable behavior, such as skipping elements or iterating over the same element multiple times.

By throwing an exception, Java enforces that modifying the collection during iteration should be controlled (using an iterator).

				
					List<String> items = new ArrayList<>(Arrays.asList("A", "B", "C"));

for (String item : items) {
    if ("B".equals(item)) {
        items.remove(item);  // Will throw ConcurrentModificationException
    }
}
				
			
The Correct Way to Remove Elements During Iteration

To safely remove elements from a collection during iteration, you should use the Iterator’s remove() method:

				
					List<String> list = new ArrayList<>(Arrays.asList("A", "B", "C"));
Iterator<String> iterator = list.iterator();
while (iterator.hasNext()) {
    String item = iterator.next();
    if ("B".equals(item)) {
        iterator.remove();  // Safe removal
    }
}
				
			

The remove() method of the Iterator is designed to safely modify the collection during iteration by ensuring the modCount (used to track whether there has been any modification) is updated in a way that the iterator is aware of. Thus, it prevents the exception.

Conclusion

This concludes the third part of the Java Collection Framework. Here, we have discussed the Map interface with implementation classes, including the internal mechanisms and use cases.

The next section will discuss concurrent collections, collections utilities (algorithms), best practices for using the Java Collection Framework, and how to avoid common pitfalls.

Dewan Fazle Ayaz

1 January, 2025

Continue reading

Picture of Dewan Fazle Ayaz
Dewan Fazle Ayaz

1 January, 2025