How to filter the incoming object of HashSet
First, we know that all sets have a Map inside them. The Key of the Map is used to store values, and the Value is a fixed Object. This is the adapter mode. The source code is as follows:
public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable.java.io.Serializable
{
static final long serialVersionUID = -5024744406713321676L;
private transient HashMap<E,Object> map;
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
/** * Constructs a new, empty set; Backing HashMap instance has * default initial capacity (16) and load factor (0.75). */ backing HashMap instance has * default initial capacity (16) and load factor (0.75)
public HashSet(a) {
map = new HashMap<>();
}
/**
* Constructs a new set containing the elements in the specified
* collection. The <tt>HashMap</tt> is created with default load factor
* (0.75) and an initial capacity sufficient to contain the elements in
* the specified collection.
*
* @param c the collection whose elements are to be placed into this set
* @throws NullPointerException if the specified collection is null
*/
public HashSet(Collection<? extends E> c) {
map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1.16));
addAll(c);
}
/**
* Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
* the specified initial capacity and the specified load factor.
*
* @param initialCapacity the initial capacity of the hash map
* @param loadFactor the load factor of the hash map
* @throws IllegalArgumentException if the initial capacity is less
* than zero, or if the load factor is nonpositive
*/
public HashSet(int initialCapacity, float loadFactor) {
map = new HashMap<>(initialCapacity, loadFactor);
}
/** * Constructs a new, empty set; The backing <tt>HashMap</tt> instance has * the specified initial capacity and default load factor (0.75)@param initialCapacity the initial capacity of the hash table
* @throws IllegalArgumentException if the initial capacity is less
* than zero
*/
public HashSet(int initialCapacity) {
map = new HashMap<>(initialCapacity);
}
/**
* Constructs a new, empty linked hash set. (This package private
* constructor is only used by LinkedHashSet.) The backing
* HashMap instance is a LinkedHashMap with the specified initial
* capacity and the specified load factor.
*
* @param initialCapacity the initial capacity of the hash map
* @param loadFactor the load factor of the hash map
* @param dummy ignored (distinguishes this
* constructor from other int, float constructor.)
* @throws IllegalArgumentException if the initial capacity is less
* than zero, or if the load factor is nonpositive
*/
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
map = newLinkedHashMap<>(initialCapacity, loadFactor); }... }Copy the code
So the question is, if two objects are passed in, what determines whether they are equal, does the old one replace the new one, and how or not?
The add method
Let’s look at the Add method
private static final Object PRESENT = new Object();
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
Copy the code
What is actually called is the HashMap add method, but value is called as a global variable PRESENT. In this case map is a global variable pointing to the HashMap object that was created when the HashSet object was created.
In addition, the return type of the add method is Boolean, so if the return value is NULL, the add is successful.
The exploration.add method is also the exploration.put method.
public V put(K key, V value) {
return putVal(hash(key), key, value, false.true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
break; p = e; }}if(e ! =null) { // existing mapping for key
V oldValue = e.value;
if(! onlyIfAbsent || oldValue ==null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}
Copy the code
Let’s analyze the source code with an example.
HashSet<String> set = new HashSet<>();
set.add("Tom");
set.add("Tom");
set.add("new String Tom");
Copy the code
Procedure for adding elements for the first time :set.add(” Tom “)
First define Node<K,V>[] TAB; Meaning is to create a named TAB of the Node Node set, then the if ((TAB = table) = = null | | (n = TAB. Length) = = 0) contrast TAB and whether the table is empty, N = (TAB = resize()).length; n = (TAB = resize()).length; The resize method returns newTab, which has been replaced with an array of length 16. N is equal to the maximum length of TAB is 16. P = TAB [I = (n-1) &hash]) == null p = TAB [I = (n-1) &hash]) == null p = TAB [I = (n-1) &hash]) == null TAB [I] = newNode(hash, key, value, null); Execute the code in the diagram (the last part of the source code) without else
The return value is null, so the put method completes and returns null, and then the Boolean says turn,.add completes.
Set.add (” Tom “) when the same element is added; How to enforce it?
Let’s discuss the second case: how does the add method determine if the same element is added:
HashSet<String> set = new HashSet<>();
set.add("Tom");
set.add("Tom");
Copy the code
Now that I’ve added a Tom element to my set, I’m going to add another Tom, and this is how putVal runs, First perform the if ((TAB = table) = = null | | (n = TAB. Length) = = 0) at the table has been adding elements for the first time assignment so is not null, TAB. The length is not obviously is zero, So if (p = TAB [I = (n-1) & hash]) == null) because the TAB element is the same as the TAB element, the TAB element is the same, and the address is the same, so the address is not null. So we’re going to do an else if, and we’re going to read the following statement
if(p.hash == hash &&((k = p.key) == key || (key ! =null && key.equals(k))))
Copy the code
P refers to TAB [] for the first time. Since the added elements are the same, the hash and key are the same, so the judgment is successful and the next step is e=p. E points to p, p points to TAB [I],
if(e ! =null) { // existing mapping for key
V oldValue = e.value;
if(! onlyIfAbsent || oldValue ==null)
e.value = value;
afterNodeAccess(e);
returnoldValue; }}Copy the code
To check that e is not null, run V oldValue = e.value; , then judge if (! OnlyIfAbsent | | oldValue = = null), onlyIfAbsent is false, so! OnlyIfAbsent is true, so the return value is oldValue, this is put and the return value received is not null so add fails. Set. Add (” new String Tom “); The execution process of
HashSet<String> set = new HashSet<>();
set.add("Tom");
set.add(new String("Tom"));// Consider: how to judge not to allow repetition?
Copy the code
The first element is added by default, and the second element is analyzed directly. And before we do that let’s look at the following program,
public static void main(String[] args) {
String name1="Tom";
System.out.println(name1.hashCode());
String name2=new String("Tom");
System.out.println(name2.hashCode());
System.out.println(name1==name2);
}
Copy the code
Output result:
At this point we know that even though == will run false, hashCode will find the same address. Now add a second String of new “Tom”, first determines the if ((TAB = table) = = null | | (n = TAB. Length) = = 0) for the first time to add the element, the table is not empty, and n are not zero, N = (TAB = resize()).length); If ((p = TAB [I = (n-1) & hash]) == null), hash = (I = TAB [I = (n-1) & hash]) == null), hash = (I = TAB [I = (n-1) & hash]) == null
if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
Copy the code
If (p == hash) {if (p == hash) {if (p == hash) {if (p == hash) {if (p == hash) { So e points to TAB [I], then execute the following program:
if(e ! =null) { // existing mapping for key
V oldValue = e.value;
if(! onlyIfAbsent || oldValue ==null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
Copy the code
E refers to p, so e is not null. Assign e.value to oldValue, and check that onlyIfAbsent is not null. Overwrite the old Tom’s value with the new Tom’s value. The last value returned is oldValue, which is a constant, so it’s not null, and put is accepted as not null so add will not succeed.
Others: Override the hashCode and equals methods to allow the student class to pass the ID attribute without adding the same ID.
First step: Rewrite the hashCode method to see if it works:
public class Student {
private String id;
public Student(String id) {
this.id = id;
}
@Override
public int hashCode(a) {
return id.hashCode();
}
public static void main(String[] args) {
HashSet<Student> set =new HashSet<Student>();
set.add(new Student("110"));
set.add(new Student("110")); }}Copy the code
Running the code finds that it is ok, adds it, and prints set.seize (); Also is 2.
The hashCode() values of the above two Student objects are different, so the addition succeeds.
Now let’s analyze why the second time will add a success:
if ((tab = table) == null || (n = tab.length) == 0)// TAB is not empty and length is not 0, so the next if
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)//p is the last added Student and obviously not null, else
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))// The hash value is the same, but the key is different, so the key is not empty, but the key and k (p.key) are different, so the next else if
e = p;
else if (p instanceof TreeNode)// Clearly not correct, so next else
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {// There is no limit on binCount, so it is an infinite loop
if ((e = p.next) == null) {// where p.next points to e
p.next = newNode(hash, key, value, null);// Save the second object here
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);// This code, whether executed or not, will execute the next break, breaking the loop.
break;
}
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
break; p = e; }}if(e ! =null) { // Existing mapping for key// Not null, so execute down
V oldValue = e.value;
if(! onlyIfAbsent || oldValue ==null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;// The value returned is null, so the add is successful!
Copy the code
If this is the case, the student class will not be able to use the id attribute to control not adding the same ID, so how can we optimize?
So when YOU look at this, you realize that if adding the same element didn’t work because the key was the same at the time, but how can we change it to make the condition work by overriding equals by checking that the ID is the same?
Public Boolean equals(Object obj) {public Boolean equals(Object obj) { If (obj instanceof Student) {// If (obj instanceof Student); // Subclass object = (subclass) superclass object. return this.id.equals(stu.id); } return false; }Copy the code