If you look closely, and I’m going to be very careful, I’m going to explain how a HashSet stores fetch elements. A HashSet implements the Set interface.

One feature of the Set interface is that elements must be unique, meaning that no two elements can be stored the same. Hashsets inherit this, too. Each Set implementation class has its own way of determining whether an element is unique. A HashSet determines whether an element is the same through its hashCode() and equals() methods.

The code is simple, but for simplicity, I’m not adding generics

Set set=new HashSet();
set.add("java");
Copy the code

First of all, a HashSet is basically implemented using a HashMap, and it ends up using arrays. It doesn’t matter if you don’t understand it, it’s good to know the underlying implementation with arrays. Get a HashSet object and store a” Java “string. Now let’s see how HashSet stores this “Java”. Let’s say I have an array of length 8.

First, the hashCode value is computed by calling the hashCode() method of the “Java” string. If the resulting structure is 5, “Java” is stored in array 5.

Well, some of the complex operations stored are explained in one sentence.

Here comes the fun part

Now I’m going to store a “PHP” string.

Set set=new HashSet();
set.add("java");
set.add("php");
Copy the code

Similarly, call the hashCode() method of “PHP”. If, I still get 5. So let’s put “PHP” at position 5. PHP “go to position 5 and see, where is my position? What’s the best language in the world? Of course, you can’t stop saving PHP. To solve this situation, a linked list (Node) is used as a data structure. In addition to storing its own data, your Node also stores the address of the next data (also known as a pointer). It’s just storing addresses. That’s easy… Array 5 contains a Node that holds both the “PHP” string object and the address of another Node that holds the “Java” string object. Remember, just the address. So now you’re all stored in array 5. There are now two nodes, one containing “PHP” and one containing “Java”. And the Node that stores “PHP” also stores the address of the Node that stores “Java”. It is easy to find the “Java” Node at this time, some people may ask, why save “PHP” and then save “Java”? It’s not because PHP is the best language in the world. It’s called headplugging. This is because the designers of HashMap (HashSet uses HashMap) think that newly saved data is more likely to be used, that’s all.

What if I want to look up “Java”?

Well, that’s not easy. Let’s calculate the hashCode of “Java”, and once we calculate it, it’s equal to 5. Let’s go to array 5. Look, eh?” php”? It’s not what I was looking for, so I went to see if the Node that saved “PHP” pointed to another Node, and sure enough, it pointed to another Node. When I looked at the other Node, it saved “Java”, and then I found it. Looking up “PHP” is even easier, just by calculating the hashCode value of “PHP”. This is one of the reasons why hashsets and HashMaps are efficient, but more on that later. The position is determined by directly evaluating the element’s hashCode value. And then you just do it directly. But what you don’t see is, if you store a lot of elements in one place, it becomes a little bit of a hassle to find them. If all 10 elements are stored in position 5, the structure looks like this: Node1–>Node2–>Node3–>… — > Node10. If I had to find the elements stored in Node10, it would be cumbersome and inefficient. It defeats the purpose of the Hash system. To avoid this, make each location of the array equally likely to be stored. What if I make each location equally likely to be stored? The problem is, even if you store elements in every location, if I continue to store elements, there will always be multiple elements in one location. How can I solve this problem?

How do you try to make every location equally likely to be stored?

Free updates

What if the element is larger than the capacity?

Free updates

How do you guarantee the uniqueness of the elements?

Free updates