6.1 Collection Fundamentals

Modern Java's Collection Framework provides powerful abstractions for storing, organizing, and manipulating groups of objects. Understanding the core interfaces and their modern implementations is essential for writing efficient, maintainable code.

The Collection Hierarchy

The Java Collections Framework is built on a hierarchy of interfaces:

Collection (interface)
├── List (ordered, allows duplicates)
│   ├── ArrayList (dynamic array)
│   ├── LinkedList (doubly-linked list)
│   └── CopyOnWriteArrayList (thread-safe)
├── Set (no duplicates)
│   ├── HashSet (hash table)
│   ├── LinkedHashSet (insertion-ordered hash set)
│   └── TreeSet (sorted set)
└── Queue (FIFO operations)
    ├── LinkedList (also implements Queue)
    ├── PriorityQueue (heap-based priority queue)
    └── ArrayDeque (resizable array deque)

Map (interface, not a Collection)
├── HashMap (hash table)
├── LinkedHashMap (insertion-ordered)
├── TreeMap (sorted by keys)
└── ConcurrentHashMap (thread-safe)

Choosing the Right Collection

Use List when:

  • Order matters
  • Duplicates are allowed
  • You need indexed access
// Fast random access, frequent reads
List<String> tags = new ArrayList<>(List.of("java", "streams", "collections"));

// Frequent insertions/deletions at ends
Deque<Task> taskQueue = new LinkedList<>();
taskQueue.addFirst(new Task("urgent"));
taskQueue.addLast(new Task("routine"));

Use Set when:

  • Uniqueness is required
  • Order doesn't matter (or use LinkedHashSet/TreeSet)
  • Fast membership testing is needed
// Fast lookup, no order guarantee
Set<String> uniqueEmails = new HashSet<>();
uniqueEmails.add("user@example.com");

// Maintains insertion order
Set<String> orderedTags = new LinkedHashSet<>();
orderedTags.add("first");
orderedTags.add("second");

// Sorted order
Set<Integer> sortedScores = new TreeSet<>();
sortedScores.add(95);
sortedScores.add(87);
// Iterates in ascending order: 87, 95

Use Map when:

  • Key-value associations are needed
  • Fast lookup by key is required
// Fast key-based access
Map<String, User> userCache = new HashMap<>();
userCache.put("user123", new User("Alice"));

// Maintains insertion order
Map<String, Integer> orderedCounts = new LinkedHashMap<>();

// Sorted by keys
Map<LocalDate, BigDecimal> sortedSales = new TreeMap<>();

Immutable Collections (Java 9+)

Modern Java provides factory methods for creating immutable collections:

// Immutable List
List<String> languages = List.of("Java", "Kotlin", "Scala");
// languages.add("Groovy"); // throws UnsupportedOperationException

// Immutable Set
Set<Integer> primes = Set.of(2, 3, 5, 7, 11);

// Immutable Map
Map<String, Integer> ages = Map.of(
    "Alice", 30,
    "Bob", 25,
    "Charlie", 35
);

// For more than 10 entries, use ofEntries
Map<String, String> config = Map.ofEntries(
    Map.entry("host", "localhost"),
    Map.entry("port", "8080"),
    Map.entry("timeout", "30"),
    Map.entry("retries", "3")
);

Benefits of Immutable Collections

  1. Thread Safety: No synchronization needed
  2. Predictability: Cannot be modified accidentally
  3. Performance: Internal optimizations possible
  4. Memory Efficiency: Compact representations
record Product(String id, String name, BigDecimal price) {}

class ProductCatalog {
    private final List<Product> products;

    public ProductCatalog(List<Product> products) {
        // Defensive copy - protects internal state
        this.products = List.copyOf(products);
    }

    public List<Product> getProducts() {
        // Safe to return - immutable
        return products;
    }
}

Creating Copies

Use copyOf methods to create immutable snapshots:

// Snapshot a mutable collection
List<String> mutableList = new ArrayList<>();
mutableList.add("item1");
List<String> snapshot = List.copyOf(mutableList);

// snapshot is immune to subsequent changes
mutableList.add("item2");
System.out.println(snapshot.size()); // 1

// If input is already immutable, copyOf returns it directly
List<String> immutable = List.of("a", "b");
List<String> copy = List.copyOf(immutable);
System.out.println(immutable == copy); // true (same instance)

Collection Views vs Copies

Understanding the difference between views and copies is crucial:

List<Integer> original = new ArrayList<>(List.of(1, 2, 3, 4, 5));

// View - reflects changes to original
List<Integer> sublist = original.subList(1, 4); // [2, 3, 4]
original.set(2, 99);
System.out.println(sublist); // [2, 99, 4] - reflects change

// Copy - independent
List<Integer> copy = new ArrayList<>(original);
original.set(0, 100);
System.out.println(copy.get(0)); // 1 - unchanged

Performance Characteristics

Operation ArrayList LinkedList HashSet TreeSet HashMap TreeMap
Add O(1)* O(1) O(1) O(log n) O(1) O(log n)
Remove O(n) O(1)** O(1) O(log n) O(1) O(log n)
Get O(1) O(n) N/A N/A O(1) O(log n)
Contains O(n) O(n) O(1) O(log n) O(1) O(log n)
Iterate O(n) O(n) O(n) O(n) O(n) O(n)

Amortized O(1), occasional O(n) when resizing

*
O(1) when removing from head/tail with iterator

Real-World Example: Event Aggregator

import java.time.*;
import java.util.*;

record Event(String type, LocalDateTime timestamp, String userId) {}

class EventAggregator {
    // Use appropriate collections for different access patterns
    private final List<Event> chronologicalEvents = new ArrayList<>();
    private final Map<String, List<Event>> eventsByUser = new HashMap<>();
    private final Map<String, Integer> eventTypeCounts = new HashMap<>();

    public void record(Event event) {
        // Maintain chronological list
        chronologicalEvents.add(event);

        // Index by user
        eventsByUser.computeIfAbsent(event.userId(), k -> new ArrayList<>())
                    .add(event);

        // Count by type
        eventTypeCounts.merge(event.type(), 1, Integer::sum);
    }

    public List<Event> getEventsForUser(String userId) {
        // Return immutable view
        return List.copyOf(
            eventsByUser.getOrDefault(userId, List.of())
        );
    }

    public Map<String, Integer> getEventTypeCounts() {
        // Return immutable copy
        return Map.copyOf(eventTypeCounts);
    }

    public List<Event> getRecentEvents(int count) {
        int size = chronologicalEvents.size();
        int fromIndex = Math.max(0, size - count);
        return List.copyOf(chronologicalEvents.subList(fromIndex, size));
    }
}

// Usage
void demonstrateEventAggregator() {
    var aggregator = new EventAggregator();

    aggregator.record(new Event("login", LocalDateTime.now(), "user1"));
    aggregator.record(new Event("search", LocalDateTime.now(), "user1"));
    aggregator.record(new Event("purchase", LocalDateTime.now(), "user2"));

    // Fast user lookup via HashMap
    List<Event> userEvents = aggregator.getEventsForUser("user1");
    System.out.println("User1 events: " + userEvents.size()); // 2

    // Aggregated counts
    Map<String, Integer> counts = aggregator.getEventTypeCounts();
    System.out.println("Event type counts: " + counts);
}

Best Practices

  1. Prefer immutable collections for return types

    // Good
    public List<String> getTags() {
        return List.copyOf(tags);
    }
    
    // Avoid
    public List<String> getTags() {
        return tags; // exposes internal mutable state
    }
    
  2. Use appropriate initial capacity

    // If you know the size, specify capacity
    List<String> names = new ArrayList<>(1000);
    Map<String, Value> cache = new HashMap<>(500);
    
  3. Choose collection based on access patterns

    // Frequent insertions/deletions at ends -> LinkedList
    Deque<Task> queue = new LinkedList<>();
    
    // Fast lookup, no order -> HashSet
    Set<String> processed = new HashSet<>();
    
    // Sorted iteration -> TreeSet
    Set<Integer> sortedIds = new TreeSet<>();
    
  4. Use computeIfAbsent and merge for maps

    // Old way
    List<String> list = map.get(key);
    if (list == null) {
        list = new ArrayList<>();
        map.put(key, list);
    }
    list.add(value);
    
    // Modern way
    map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    
    // Counting
    map.merge(key, 1, Integer::sum);
    

Common Pitfalls

1. Modifying collection while iterating

// Wrong - ConcurrentModificationException
List<String> items = new ArrayList<>(List.of("a", "b", "c"));
for (String item : items) {
    if (item.equals("b")) {
        items.remove(item); // throws exception
    }
}

// Correct - use iterator
Iterator<String> it = items.iterator();
while (it.hasNext()) {
    if (it.next().equals("b")) {
        it.remove();
    }
}

// Modern - use removeIf
items.removeIf(item -> item.equals("b"));

2. Using == instead of equals for collection elements

List<String> list = new ArrayList<>();
list.add(new String("hello"));
System.out.println(list.contains("hello")); // true (uses equals)
System.out.println(list.get(0) == "hello"); // false (different objects)

3. Forgetting that subList is a view

List<Integer> list = new ArrayList<>(List.of(1, 2, 3, 4, 5));
List<Integer> sub = list.subList(1, 3); // [2, 3]
list.remove(0); // Modifies underlying list
// sub is now invalid - throws ConcurrentModificationException on access