How to filter the incoming object of HashSet

First, we know that all sets have a Map inside them. The Key of the Map is used to store values, and the Value is a fixed Object. This is the adapter mode. The source code is as follows:

public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable.java.io.Serializable
{
    static final long serialVersionUID = -5024744406713321676L;

    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    /** * Constructs a new, empty set; Backing HashMap instance has * default initial capacity (16) and load factor (0.75). */ backing HashMap instance has * default initial capacity (16) and load factor (0.75)
    public HashSet(a) {
        map = new HashMap<>();
    }

    /**
     * Constructs a new set containing the elements in the specified
     * collection.  The <tt>HashMap</tt> is created with default load factor
     * (0.75) and an initial capacity sufficient to contain the elements in
     * the specified collection.
     *
     * @param c the collection whose elements are to be placed into this set
     * @throws NullPointerException if the specified collection is null
     */
    public HashSet(Collection<? extends E> c) {
        map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1.16));
        addAll(c);
    }

    /**
     * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
     * the specified initial capacity and the specified load factor.
     *
     * @param      initialCapacity   the initial capacity of the hash map
     * @param      loadFactor        the load factor of the hash map
     * @throws     IllegalArgumentException if the initial capacity is less
     *             than zero, or if the load factor is nonpositive
     */
    public HashSet(int initialCapacity, float loadFactor) {
        map = new HashMap<>(initialCapacity, loadFactor);
    }

    /** * Constructs a new, empty set; The backing <tt>HashMap</tt> instance has * the specified initial capacity and default load factor (0.75)@param      initialCapacity   the initial capacity of the hash table
     * @throws     IllegalArgumentException if the initial capacity is less
     *             than zero
     */
    public HashSet(int initialCapacity) {
        map = new HashMap<>(initialCapacity);
    }

    /**
     * Constructs a new, empty linked hash set.  (This package private
     * constructor is only used by LinkedHashSet.) The backing
     * HashMap instance is a LinkedHashMap with the specified initial
     * capacity and the specified load factor.
     *
     * @param      initialCapacity   the initial capacity of the hash map
     * @param      loadFactor        the load factor of the hash map
     * @param      dummy             ignored (distinguishes this
     *             constructor from other int, float constructor.)
     * @throws     IllegalArgumentException if the initial capacity is less
     *             than zero, or if the load factor is nonpositive
     */
    HashSet(int initialCapacity, float loadFactor, boolean dummy) {
        map = newLinkedHashMap<>(initialCapacity, loadFactor); }... }Copy the code

So the question is, if two objects are passed in, what determines whether they are equal, does the old one replace the new one, and how or not?

The add method

Let’s look at the Add method

private static final Object PRESENT = new Object();

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
Copy the code

What is actually called is the HashMap add method, but value is called as a global variable PRESENT. In this case map is a global variable pointing to the HashMap object that was created when the HashSet object was created.

In addition, the return type of the add method is Boolean, so if the return value is NULL, the add is successful.

The exploration.add method is also the exploration.put method.

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false.true);
    }

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                        break; p = e; }}if(e ! =null) { // existing mapping for key
                V oldValue = e.value;
                if(! onlyIfAbsent || oldValue ==null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
Copy the code

Let’s analyze the source code with an example.

HashSet<String> set = new HashSet<>();
set.add("Tom");
set.add("Tom");
set.add("new String Tom");
Copy the code

Procedure for adding elements for the first time :set.add(” Tom “)

First define Node<K,V>[] TAB; Meaning is to create a named TAB of the Node Node set, then the if ((TAB = table) = = null | | (n = TAB. Length) = = 0) contrast TAB and whether the table is empty, N = (TAB = resize()).length; n = (TAB = resize()).length; The resize method returns newTab, which has been replaced with an array of length 16. N is equal to the maximum length of TAB is 16. P = TAB [I = (n-1) &hash]) == null p = TAB [I = (n-1) &hash]) == null p = TAB [I = (n-1) &hash]) == null TAB [I] = newNode(hash, key, value, null); Execute the code in the diagram (the last part of the source code) without else



The return value is null, so the put method completes and returns null, and then the Boolean says turn,.add completes.

Set.add (” Tom “) when the same element is added; How to enforce it?

Let’s discuss the second case: how does the add method determine if the same element is added:

HashSet<String> set = new HashSet<>();
set.add("Tom");
set.add("Tom");
Copy the code

Now that I’ve added a Tom element to my set, I’m going to add another Tom, and this is how putVal runs, First perform the if ((TAB = table) = = null | | (n = TAB. Length) = = 0) at the table has been adding elements for the first time assignment so is not null, TAB. The length is not obviously is zero, So if (p = TAB [I = (n-1) & hash]) == null) because the TAB element is the same as the TAB element, the TAB element is the same, and the address is the same, so the address is not null. So we’re going to do an else if, and we’re going to read the following statement

if(p.hash == hash &&((k = p.key) == key || (key ! =null && key.equals(k)))) 
Copy the code

P refers to TAB [] for the first time. Since the added elements are the same, the hash and key are the same, so the judgment is successful and the next step is e=p. E points to p, p points to TAB [I],

	if(e ! =null) { // existing mapping for key
		V oldValue = e.value;
		if(! onlyIfAbsent || oldValue ==null)
			e.value = value;
		afterNodeAccess(e);
		returnoldValue; }}Copy the code

To check that e is not null, run V oldValue = e.value; , then judge if (! OnlyIfAbsent | | oldValue = = null), onlyIfAbsent is false, so! OnlyIfAbsent is true, so the return value is oldValue, this is put and the return value received is not null so add fails. Set. Add (” new String Tom “); The execution process of

HashSet<String> set = new HashSet<>();
set.add("Tom");
set.add(new String("Tom"));// Consider: how to judge not to allow repetition?
Copy the code

The first element is added by default, and the second element is analyzed directly. And before we do that let’s look at the following program,

	public static void main(String[] args) {
		String name1="Tom";
		System.out.println(name1.hashCode());
		String name2=new String("Tom");
		System.out.println(name2.hashCode());
		System.out.println(name1==name2);
	}
Copy the code

Output result:

At this point we know that even though == will run false, hashCode will find the same address. Now add a second String of new “Tom”, first determines the if ((TAB = table) = = null | | (n = TAB. Length) = = 0) for the first time to add the element, the table is not empty, and n are not zero, N = (TAB = resize()).length); If ((p = TAB [I = (n-1) & hash]) == null), hash = (I = TAB [I = (n-1) & hash]) == null), hash = (I = TAB [I = (n-1) & hash]) == null

if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
Copy the code

If (p == hash) {if (p == hash) {if (p == hash) {if (p == hash) {if (p == hash) { So e points to TAB [I], then execute the following program:

	 if(e ! =null) { // existing mapping for key
		V oldValue = e.value;
		if(! onlyIfAbsent || oldValue ==null)
			e.value = value;
		afterNodeAccess(e);
		return oldValue;
	}
Copy the code

E refers to p, so e is not null. Assign e.value to oldValue, and check that onlyIfAbsent is not null. Overwrite the old Tom’s value with the new Tom’s value. The last value returned is oldValue, which is a constant, so it’s not null, and put is accepted as not null so add will not succeed.

Others: Override the hashCode and equals methods to allow the student class to pass the ID attribute without adding the same ID.

First step: Rewrite the hashCode method to see if it works:

public class Student {
	private String id;
	public Student(String id) {
		this.id = id;
	}
	@Override
	public int hashCode(a) {
		return id.hashCode();
	}
	public static void main(String[] args) {
		HashSet<Student> set =new HashSet<Student>();
		set.add(new Student("110"));
		set.add(new Student("110")); }}Copy the code

Running the code finds that it is ok, adds it, and prints set.seize (); Also is 2.

The hashCode() values of the above two Student objects are different, so the addition succeeds.

Now let’s analyze why the second time will add a success:

		if ((tab = table) == null || (n = tab.length) == 0)// TAB is not empty and length is not 0, so the next if
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)//p is the last added Student and obviously not null, else
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))// The hash value is the same, but the key is different, so the key is not empty, but the key and k (p.key) are different, so the next else if
                e = p;
            else if (p instanceof TreeNode)// Clearly not correct, so next else
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {// There is no limit on binCount, so it is an infinite loop
                    if ((e = p.next) == null) {// where p.next points to e
                        p.next = newNode(hash, key, value, null);// Save the second object here
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);// This code, whether executed or not, will execute the next break, breaking the loop.
                        break;
                    }
                    if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
                        break; p = e; }}if(e ! =null) { // Existing mapping for key// Not null, so execute down
                V oldValue = e.value;
                if(! onlyIfAbsent || oldValue ==null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;// The value returned is null, so the add is successful!
Copy the code

If this is the case, the student class will not be able to use the id attribute to control not adding the same ID, so how can we optimize?

So when YOU look at this, you realize that if adding the same element didn’t work because the key was the same at the time, but how can we change it to make the condition work by overriding equals by checking that the ID is the same?

Public Boolean equals(Object obj) {public Boolean equals(Object obj) { If (obj instanceof Student) {// If (obj instanceof Student); // Subclass object = (subclass) superclass object. return this.id.equals(stu.id); } return false; }Copy the code