preface

Java provides a variety of container types, such as the Map class HashMap TreeMap, the Set class HashSet TreeSet, and the List class ArrayList LinkedList. However, some of these container classes require that the key cannot be null. Value cannot be null. This paper records some of their own thinking and speculation on these issues, if there is any wrong welcome criticism.

Map

First said the status quo

  • HashTableIn JDK1.0, neither key nor value can benull
  • HashMapIn JDK1.2, both key and value can benull
  • LinkedHashMapIntroduced in JDK1.4, both key and value can benull
  • TreeMapIntroduced in JDK1.2, key cannot benull, value can benull
  • ConcurrentHashMapIntroduced in JDK1.5, neither key nor value can benull

Besides, thinking

In JDK1.0, HashTable was introduced. Since HashTable is thread-safe and can be used by multiple threads, if value is allowed to be null, containsKey method must be called to determine whether a key is included or not. When the get method is called to obtain the corresponding value, since the HashTable can be used by multiple threads, other threads may modify the corresponding key during this period, resulting in changes in the return value, which may not exist. In order to deal with this, you have to lock the container key and get calls to be atomic, but this increases the user’s cost, and the interface designer doesn’t want this to happen, so value is not allowed to be null, If the return value is null, you can determine whether the corresponding key exists.

HashTable keys cannot be null, but the designer requires all keys to implement hashCode and equals methods. Only non-NULL objects in Java implement these two methods.

This class implements a hash table, which maps keys to values. Any non-null object can be used as a key or as a value.

To successfully store and retrieve objects from a hashtable, the objects used as keys must implement the hashCode method and the equals method.

Later in JDK1.2, HashMap and TreeMap were introduced, both of which are not thread-safe maps and therefore are typically operated in a single thread, regardless of thread-safety concerns. The containsKey method can be used to check whether there is a key problem, and the get method can be used to obtain the corresponding value. Therefore, the value can be null.

For a HashMap key, the designer might consider that the key is not null-free, and only the null key is treated in a special way. This is also true in the code.

For a TreeMap key, the key should implement the Comparable interface in the case of a no-argument construct and sort the key through the compareTo method, which cannot be called if the key is null.

In JDK1.4, LinkedHashMap was introduced, which is a Map that maintains a linked list structure and can be used to build LRU caches. It inherits from HashMap, so its key and value are identical to HashMap requirements and can be null.

A concurrent, thread-safe ConcurrentHashMap (ConcurrentHashMap) is introduced in JDK1.5, which is compatible with HashTable, so keys and values cannot be null

A hash table supporting full concurrency of retrievals and high expected concurrency for updates. This class obeys the same functional specification as java.util.Hashtable, and includes versions of methods corresponding to each method of Hashtable.

Finally analyze the source code

HashTable

HashTable, introduced in JDK1.0, is thread-safe and can be used in multithreaded scenarios

Analyzing the PUT operation

Method Entry first nulls value and creates corresponding Entry instance. Call the hashCode method of the key to get the hash value. The key must not be null

public synchronized V put(K key, V value) {
    // Make sure the value is not null
    if (value == null) {
        throw new NullPointerException();
    }

    // Makes sure the key is not already in the hashtable.Entry<? ,? > tab[] = table;int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    Entry<K,V> entry = (Entry<K,V>)tab[index];
    for(; entry ! =null ; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }

    addEntry(hash, key, value, index);
    return null;
}
Copy the code

Analyzing the GET operation

Call the hashCode method of the key to calculate the hash value. If no Entry instance corresponding to the key exists, null is returned. Therefore, you can determine whether the corresponding key exists by checking whether the returned value is null

public synchronized V get(Object key) { Entry<? ,? > tab[] = table;int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for(Entry<? ,? > e = tab[index] ; e ! =null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return(V)e.value; }}return null;
}
Copy the code

HashMap

HashMap is introduced in JDK2.0

Analyzing the PUT operation

For keys, the hash method is first used to calculate the hash value of the key. If key == null, the hash value is 0; otherwise, the corresponding hashCode method is used to calculate the hash value. For a value, the value is stored in the internal Node class at putVal, and there is no special nullation.

public V put(K key, V value) {
    return putVal(hash(key), key, value, false.true);
}

static final int hash(Object key) {
    int h;
    return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                    break; p = e; }}if(e ! =null) { // existing mapping for key
            V oldValue = e.value;
            if(! onlyIfAbsent || oldValue ==null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
Copy the code

Analyzing the GET operation

Return null if the inner class Node instance of the key does not exist, which means that the Map cannot determine whether the element is contained in the Map just by determining whether the return value is null

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}
Copy the code

LinkedHashMap

LinkedHashMap is introduced in JDK1.4, and the LinkedHashMap inherits from HashMap

Analyzing the GET operation

The afterNodeAccess method can be used for LRU cache

public V get(Object key) {
    Node<K,V> e;
    if ((e = getNode(hash(key), key)) == null)
        return null;
    if (accessOrder)
        afterNodeAccess(e);
    return e.value;
}
Copy the code

The PUT operation inherits directly from the parent class and is no longer parsed

TreeMap

JDK1.2 introduced TreeMap, which is an ordered Map that sorts using either a custom Comparator implementation or the key-based Comparable#compareTo method.

A Red-Black tree based NavigableMap implementation. The map is sorted according to the Comparable natural ordering of its keys, or by a Comparator provided at map creation time, depending on which constructor is used.

Analyzing the PUT operation

If the constructor does not provide a Comparator implementation, the key must not be null. Otherwise, the Comparable#compareTo method cannot be called. Value does not nullify and is used directly to construct the inner class Entry

public V put(K key, V value) {
    Entry<K,V> t = root;
    if (t == null) {
        compare(key, key); // type (and possibly null) check

        root = new Entry<>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    int cmp;
    Entry<K,V> parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    if(cpr ! =null) {
        do {
            parent = t;
            cmp = cpr.compare(key, t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while(t ! =null);
    }
    else {
        if (key == null)
            throw new NullPointerException();
        @SuppressWarnings("unchecked")
            Comparable<? super K> k = (Comparable<? super K>) key;
        do {
            parent = t;
            cmp = k.compareTo(t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while(t ! =null);
    }
    Entry<K,V> e = new Entry<>(key, value, parent);
    if (cmp < 0)
        parent.left = e;
    else
        parent.right = e;
    fixAfterInsertion(e);
    size++;
    modCount++;
    return null;
}
Copy the code

Analyzing the GET operation

Obtain the Entry corresponding to the key. If no Entry corresponding to the key is obtained, null is returned. Therefore, you cannot determine whether the key exists only by checking whether the returned value is null

public V get(Object key) {
    Entry<K,V> p = getEntry(key);
    return (p==null ? null : p.value);
}

final Entry<K,V> getEntry(Object key) {
    // Offload comparator-based version for sake of performance
    if(comparator ! =null)
        return getEntryUsingComparator(key);
    if (key == null)
        throw new NullPointerException();
    @SuppressWarnings("unchecked")
        Comparable<? super K> k = (Comparable<? super K>) key;
    Entry<K,V> p = root;
    while(p ! =null) {
        int cmp = k.compareTo(p.key);
        if (cmp < 0)
            p = p.left;
        else if (cmp > 0)
            p = p.right;
        else
            return p;
    }
    return null;
}
Copy the code

ConcurrentHashMap

The ConcurrentHashMap introduced in JDK1.5 is thread-safe and can be used in multithreaded scenarios

Analyzing the PUT operation

The key and value cannot be null

public V put(K key, V value) {
    return putVal(key, value, false);
}

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    
    // omit the following
}

static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}
Copy the code

Analyzing the GET operation

Call the key’s hashCode method to retrieve the hash value, or return NULL if no value exists

public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    int h = spread(key.hashCode());
    if((tab = table) ! =null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) ! =null) {
        if ((eh = e.hash) == h) {
            if((ek = e.key) == key || (ek ! =null && key.equals(ek)))
                return e.val;
        }
        else if (eh < 0)
            return(p = e.find(h, key)) ! =null ? p.val : null;
        while((e = e.next) ! =null) {
            if(e.hash == h && ((ek = e.key) == key || (ek ! =null && key.equals(ek))))
                returne.val; }}return null;
}
Copy the code

Set

First said the status quo

  • HashSetIntroduced in JDK1.2, elements can benull
  • TreeSetIntroduced in JDK1.2, elements cannot benull
  • LinkedHashSetIntroduced in JDK1.4, elements can benull

Besides, thinking

Set is implemented using a Map. The key of a Map consists of a Set, and the value of a Map is a fixed non-null value.

A HashSet corresponds to a HashMap, and since the key of a HashMap is allowed to be null, the element of a HashSet can be null.

TreeSet corresponds to TreeMap. The key of TreeMap cannot be null. Therefore, TreeSet elements cannot be null.

LinkedHashSet corresponds to LinkedHashMap. Since the key of the LinkedHashMap is allowed to be null, the element of the LinkedHashSet can be NULL.

Finally analyze the source code

HashSet

HashSet definition

public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable.java.io.Serializable {
    
    static final long serialVersionUID = -5024744406713321676L;

    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    /** * Constructs a new, empty set; Backing HashMap instance has * default initial capacity (16) and load factor (0.75). */ backing HashMap instance has * default initial capacity (16) and load factor (0.75)
    public HashSet(a) {
        map = newHashMap<>(); }}Copy the code

Analyzing the Add operation

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
Copy the code

Analyzing the Remove operation

public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}
Copy the code

LinkedHashSet

LinkedHashSet descends from HashSet, and the add and remove operations are consistent with the parent class

public class LinkedHashSet<E>
    extends HashSet<E>
    implements Set<E>, Cloneable.java.io.Serializable {

    private static final long serialVersionUID = -2851667679971038690L;

    /**
     * Constructs a new, empty linked hash set with the specified initial
     * capacity and load factor.
     *
     * @param      initialCapacity the initial capacity of the linked hash set
     * @param      loadFactor      the load factor of the linked hash set
     * @throws     IllegalArgumentException  if the initial capacity is less
     *               than zero, or if the load factor is nonpositive
     */
    public LinkedHashSet(int initialCapacity, float loadFactor) {
        super(initialCapacity, loadFactor, true);
    }

    /** * Constructs a new, Empty Linked hash set with the specified Initial * capacity and default load factor (0.75). * *@param   initialCapacity   the initial capacity of the LinkedHashSet
     * @throws  IllegalArgumentException if the initial capacity is less
     *              than zero
     */
    public LinkedHashSet(int initialCapacity) {
        super(initialCapacity, .75f.true);
    }

    Constructs a new, empty linked hash set with the default initial * capacity (16) and load factor (0.75). */ Constructs a new, empty linked hash set with the default initial * capacity (16) and load factor (0.75)
    public LinkedHashSet(a) {
        super(16..75f.true); }}Copy the code

TreeSet

TreeSet definition

public class TreeSet<E> extends AbstractSet<E>
    implements NavigableSet<E>, Cloneable.java.io.Serializable
{
    /** * The backing map. */
    private transient NavigableMap<E,Object> m;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    /** * Constructs a set backed by the specified navigable map. */
    TreeSet(NavigableMap<E,Object> m) {
        this.m = m;
    }

    /**
     * Constructs a new, empty tree set, sorted according to the
     * natural ordering of its elements.  All elements inserted into
     * the set must implement the {@link Comparable} interface.
     * Furthermore, all such elements must be <i>mutually
     * comparable</i>: {@code e1.compareTo(e2)} must not throw a
     * {@code ClassCastException} for any elements {@code e1} and
     * {@code e2} in the set.  If the user attempts to add an element
     * to the set that violates this constraint (for example, the user
     * attempts to add a string element to a set whose elements are
     * integers), the {@code add} call will throw a
     * {@code ClassCastException}.
     */
    public TreeSet(a) {
        this(new TreeMap<E,Object>());
    }

    /**
     * Constructs a new, empty tree set, sorted according to the specified
     * comparator.  All elements inserted into the set must be <i>mutually
     * comparable</i> by the specified comparator: {@code comparator.compare(e1,
     * e2)} must not throw a {@code ClassCastException} for any elements
     * {@code e1} and {@code e2} in the set.  If the user attempts to add
     * an element to the set that violates this constraint, the
     * {@code add} call will throw a {@code ClassCastException}.
     *
     * @param comparator the comparator that will be used to order this set.
     *        If {@code null}, the {@linkplain Comparable natural
     *        ordering} of the elements will be used.
     */
    public TreeSet(Comparator<? super E> comparator) {
        this(new TreeMap<>(comparator));
    }

Copy the code

Analyzing the Add operation

public boolean add(E e) {
    return m.put(e, PRESENT)==null;
}
Copy the code

Analyzing the Remove operation

public boolean remove(Object o) {
    return m.remove(o)==PRESENT;
}
Copy the code

List

First said the status quo

Vector JDK1.0 introduced a thread-safe List, where elements can be null

ArrayList introduced in JDK1.2, a non-thread-safe List whose elements can be null

LinkedList introduced in JDK1.2, an off-site secure List where elements can be null

Besides, thinking

List also introduced thread-safe container classes in JDK1.0, but List is an ordered sequence container, which usually uses an index to fetch elements. As long as the index does not cross the boundary, the corresponding element must exist. There is no need to use NULL to determine the existence of an element. So elements can all be null.

Finally analyze the source code

Vector

At the bottom of a Vector is an array

public class Vector<E>
    extends AbstractList<E>
    implements List<E>, RandomAccess.Cloneable.java.io.Serializable
{
    /** * The array buffer into which the components of the vector are * stored. The capacity of the vector is the length of  this array buffer, * and is at least large enough to contain all the vector's elements. * * <p>Any array elements following the last element in the Vector are null. * *@serial* /
    protected Object[] elementData;
}
Copy the code

Analyzing the Add operation

Null validation is not performed on added elements, allowing null

public synchronized boolean add(E e) {
    modCount++;
    ensureCapacityHelper(elementCount + 1);
    elementData[elementCount++] = e;
    return true;
}
Copy the code

Analyzing the GET operation

Fetching the element value of the array subscript directly means that there is no way to determine whether the element is in the List by determining whether it is null.

public synchronized E get(int index) {
    if (index >= elementCount)
        throw new ArrayIndexOutOfBoundsException(index);

    return elementData(index);
}

E elementData(int index) {
    return (E) elementData[index];
}
Copy the code

To determine whether an element is contained, use the contains method.

public boolean contains(Object o) {
    return indexOf(o, 0) > =0;
}

public synchronized int indexOf(Object o, int index) {
    if (o == null) {
        for (int i = index ; i < elementCount ; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = index ; i < elementCount ; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}
Copy the code

ArrayList

An ArrayList is an array at the bottom

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess.Cloneable.java.io.Serializable {
    private static final long serialVersionUID = 8683452581122892189L;

    /** * Default initial capacity. */
    private static final int DEFAULT_CAPACITY = 10;

    /** * The array buffer into which the elements of the ArrayList are stored. * The capacity of the ArrayList is the length of this array buffer. Any * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA * will be expanded to DEFAULT_CAPACITY when the first element is added. */
    transient Object[] elementData; // non-private to simplify nested class access
}
Copy the code

Analyzing the Add operation

Null validation is not performed on added elements, allowing null

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}
Copy the code

Analyzing the GET operation

Fetching the element value of the array subscript directly means that there is no way to determine whether the element is in the List by determining whether it is null.

public E get(int index) {
    rangeCheck(index);

    return elementData(index);
}

E elementData(int index) {
    return (E) elementData[index];
}
Copy the code

LinkedList

The underlying LinkedList is a LinkedList structure

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable.java.io.Serializable {
    transient int size = 0;

    /** * Pointer to first node. * Invariant: (first == null && last == null) || * (first.prev == null && first.item ! = null) */
    transient Node<E> first;

    /** * Pointer to last node. * Invariant: (first == null && last == null) || * (last.next == null && last.item ! = null) */
    transient Node<E> last;
}
Copy the code

Analyzing the Add operation

Null validation is not performed on added elements, allowing null

public boolean add(E e) {
    linkLast(e);
    return true;
}

void linkLast(E e) {
    final Node<E> l = last;
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}
Copy the code

Analyzing the GET operation

Fetching the element value of the array subscript directly means that there is no way to determine whether the element is in the List by determining whether it is null.

public E get(int index) {
    rangeCheck(index);

    return elementData(index);
}

E elementData(int index) {
    return (E) elementData[index];
}
Copy the code