The problem
(1) What is the difference between a Collection and a Set?
(2) How to ensure that the elements added to the HashSet are not repeated?
(3) Does HashSet allow null elements?
(4) Is HashSet ordered?
(5) Is HashSet synchronous?
(6) What is fail-fast?
Introduction to the
Set, the concept is a little fuzzy.
Broadly speaking, collections in Java refer to the container classes under the java.util package, including all classes related to Collections and maps.
In terms of meaning, we generally refer to Collection classes in Java collections, not Map classes.
In a narrow sense, a mathematical Set refers to a container that does not contain duplicate elements, that is, there are no two identical elements in the Set, which correspond to a Set in Java.
Specific how to understand or to look at the context.
For example, if you’re being asked to talk about collections in Java, you’re probably speaking broadly.
For example, adding all elements from another Set to a Set is neutral.
A HashSet is an implementation of a Set, and the underlying use of a HashMap is to ensure that elements are not duplicated.
Source code analysis
attribute
// Use HashMap internally
private transient HashMap<E,Object> map;
// Virtual object, used as value in map
private static final Object PRESENT = new Object();
Copy the code
A constructor
public HashSet(a) {
map = new HashMap<>();
}
public HashSet(Collection<? extends E> c) {
map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1.16));
addAll(c);
}
public HashSet(int initialCapacity, float loadFactor) {
map = new HashMap<>(initialCapacity, loadFactor);
}
public HashSet(int initialCapacity) {
map = new HashMap<>(initialCapacity);
}
// Non-public, mainly used for LinkedHashSet
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
map = new LinkedHashMap<>(initialCapacity, loadFactor);
}
Copy the code
Constructors call the corresponding constructors of the HashMap.
The last constructor is a bit special in that it is not public, meaning it can only be called by the same package or subclass, which is unique to LinkedHashSet.
Add elements
Call the put() method of the HashMap directly, using the element itself as the key and PRESENT as the value, meaning that all values in the map are the same.
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
Copy the code
Remove elements
Call the remove() method of the HashMap directly. Note that the map remove returns a value for the removed element, while the Set remov returns a Boolean.
And just to check, if it’s null it means there’s no element, but if it’s not null it equals PRESENT.
public boolean remove(Object o) {
return map.remove(o)==PRESENT;
}
Copy the code
Query element
Set doesn’t have a get() method because get doesn’t seem to make sense, unlike a List where you can get elements by index.
There’s only one method that checks for the presence of the element, contains(), which directly calls the map’s containsKey() method.
public boolean contains(Object o) {
return map.containsKey(o);
}
Copy the code
Traverse elements
The iterator to the map’s keySet is called directly.
public Iterator<E> iterator(a) {
return map.keySet().iterator();
}
Copy the code
All the source code
package java.util;
import java.io.InvalidObjectException;
import sun.misc.SharedSecrets;
public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable.java.io.Serializable
{
static final long serialVersionUID = -5024744406713321676L;
// The inner elements are stored in the HashMap
private transient HashMap<E,Object> map;
// The virtual element is stored in the value of the map element
private static final Object PRESENT = new Object();
// Empty constructor
public HashSet(a) {
map = new HashMap<>();
}
// Add all the elements of another Set to the current Set
// Note that the map is initialized with its initial capacity calculated
public HashSet(Collection<? extends E> c) {
map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1.16));
addAll(c);
}
// Specify the initial capacity and load factor
public HashSet(int initialCapacity, float loadFactor) {
map = new HashMap<>(initialCapacity, loadFactor);
}
// Specify only the initial capacity
public HashSet(int initialCapacity) {
map = new HashMap<>(initialCapacity);
}
// LinkedHashSet special method
// dummy has no real meaning, just to keep up with the above method signature
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
map = new LinkedHashMap<>(initialCapacity, loadFactor);
}
/ / the iterator
public Iterator<E> iterator(a) {
return map.keySet().iterator();
}
// Number of elements
public int size(a) {
return map.size();
}
// Check if it is null
public boolean isEmpty(a) {
return map.isEmpty();
}
// Check whether an element is included
public boolean contains(Object o) {
return map.containsKey(o);
}
// Add elements
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
// Delete elements
public boolean remove(Object o) {
return map.remove(o)==PRESENT;
}
// Clear all elements
public void clear(a) {
map.clear();
}
// Clone method
@SuppressWarnings("unchecked")
public Object clone(a) {
try {
HashSet<E> newSet = (HashSet<E>) super.clone();
newSet.map = (HashMap<E, Object>) map.clone();
return newSet;
} catch (CloneNotSupportedException e) {
throw newInternalError(e); }}// serialize the write method
private void writeObject(java.io.ObjectOutputStream s)
throws java.io.IOException {
// Write a non-static, non-transient property
s.defaultWriteObject();
// Write out the map's capacity and load factor
s.writeInt(map.capacity());
s.writeFloat(map.loadFactor());
// Write the number of elements
s.writeInt(map.size());
// Iterate over all the elements
for (E e : map.keySet())
s.writeObject(e);
}
// serialize the read in method
private void readObject(java.io.ObjectInputStream s)
throws java.io.IOException, ClassNotFoundException {
// Read a non-static, non-transient property
s.defaultReadObject();
// Read in the capacity and check that it cannot be less than 0
int capacity = s.readInt();
if (capacity < 0) {
throw new InvalidObjectException("Illegal capacity: " +
capacity);
}
// Read the load factor and check that it cannot be less than or equal to 0 or NaN(Not a Number)
// java.lang.Float.NaN = 0.0f / 0.0f;
float loadFactor = s.readFloat();
if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
throw new InvalidObjectException("Illegal load factor: " +
loadFactor);
}
// Read in the number of elements and check that it cannot be less than 0
int size = s.readInt();
if (size < 0) {
throw new InvalidObjectException("Illegal size: " +
size);
}
// Reset the capacity based on the number of elements
// This is to ensure that the map has enough capacity to accommodate all elements and prevent unnecessary expansion
capacity = (int) Math.min(size * Math.min(1 / loadFactor, 4.0 f),
HashMap.MAXIMUM_CAPACITY);
// Double check something, ignore unimportant code
SharedSecrets.getJavaOISAccess()
.checkArray(s, Map.Entry[].class, HashMap.tableSizeFor(capacity));
// Create a map and check if it is of type LinkedHashSetmap = (((HashSet<? >)this) instanceof LinkedHashSet ?
new LinkedHashMap<E,Object>(capacity, loadFactor) :
new HashMap<E,Object>(capacity, loadFactor));
// Read in all the elements and place them in the map
for (int i=0; i<size; i++) {
@SuppressWarnings("unchecked") E e = (E) s.readObject(); map.put(e, PRESENT); }}// Divisible iterator, mainly used for multi-threaded parallel iteration processing
public Spliterator<E> spliterator(a) {
return new HashMap.KeySpliterator<E,Object>(map, 0, -1.0.0); }}Copy the code
conclusion
(1) Inside the HashSet, the key of the HashMap is used to store elements, so as to ensure that elements are not repeated;
(2) HashSet is unordered because the key of a HashMap is unordered;
(3) A null element is allowed in a HashSet because a HashMap allows a null key;
(4) HashSet is not thread-safe;
(5) HashSet does not have a get() method;
eggs
(1) Ali manual says, when using Java collection to specify the size of their own collection, through this source analysis, you know how to initialize HashMap initial capacity?
We find that the following constructor tells us very clearly how to specify capacity.
If we expect a HashMap to store n elements, its capacity should be specified as (n/0.75f) + 1. If this value is less than 16, we use 16 instead.
The specified capacity during initialization reduces capacity expansion times and improves efficiency.
public HashSet(Collection<? extends E> c) {
map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1.16));
addAll(c);
}
Copy the code
(2) What is fail-fast?
The Fail-fast mechanism is a bug mechanism in Java collections.
When using iterators iteration, if found changes, set the failure to respond quickly, throw ConcurrentModificationException.
Such changes can be made by other threads or by the current thread itself, such as by calling remove() to remove elements during iteration.
Also, not all collections in Java have fail-fast mechanisms. For example, final-consistent ConcurrentHashMap, CopyOnWriterArrayList, etc., do not have fast-fail.
So how does Fail-Fast come about?
If you are careful, you may notice that there is an attribute called “modCount” in both ArrayList and HashMap. The value is incremented by 1 each time the set is modified, and this value is recorded in the expectedModCount before traversal to check whether the two are consistent during the traversal. If there is any inconsistency, it indicates that there is modification. Throw ConcurrentModificationException.
Welcome to pay attention to my public number “Tong Elder brother read source code”, view more source code series articles, with Tong elder brother tour the ocean of source code.