Make writing a habit together! This is the fourth day of my participation in the “Gold Digging Day New Plan · April More text Challenge”. Click here for more details.

background

  1. Reading the basics of magic numbers is recommended. Magic number 0x61C88647 Learn

  2. Question: Is it possible to use this magic number to resolve hash collisions?

  3. Question: Does the ring array design have any effect on generating hash collisions? If so, is it positive or negative?

process

  • Magic number working process and its characteristics
  1. The test code

  2. The results of

  3. conclusion

    Note how to calculate: hashCode & (length-1)

    The length is 16, 0 to 15, and the result of the hash is uniform, perfect, no repetition. And 16,17 found that the cycle began at the beginning.

  • How do two ThreadLocal instances work?
  1. code

  1. instructions

    The figure above shows two ThreadLocal instances, num and num2. Where the initial values, we set them all to 0.

  2. thinking

    We created num and num2 using new ThreadLocal(), and these two instances are definitely different, that’s for sure. If this is the case, then num’s attribute threadLocalHashCode has a value of 0, as does num2’s attribute threadLocalHashCode. This is our most conventional and confident conclusion. But is that really the case?

  3. validation

  1. Verification results

    The results tell us, not in the way we expected. The result above is the actual number generated by the ThreadLocal source code. But why is this so?

    When our JVM actively uses the ThreadLocal class, by executing new ThreadLocal(), the JVM loads the ThreadLocal class, links it, and initializes it. The ThreadLocal class is loaded into the JDK1.8 metadata space. When initialized, the num attribute threadLocalHashCode is set to 0, and the AtomicInteger value in nextHashCode is set to 1640531527.

    When we execute new ThreadLocal() the second time, the JVM doesn’t need to load the ThreadLocal class because the metadata space is already there. At this point, initialization is performed. The num2 attribute threadLocalHashCode is set to 1640531527, and the AtomicInteger value in nextHashCode is set to -1013904242.

  2. The problem here is that although new ThreadLocal() is used each time, the computed value does not start at the initial value of 0. I’m going to add 1640531527 each time.

  • Is it possible to use this magic number to resolve hash collisions?

    Use the algorithm threadLocalHashCode & (Length-1). Totally doable. Because from the tests we did, it was uniformly perfect hashing. A perfect hash is 16 lengths and then produces each number in 0-15.

    It does satisfy resolving hash collisions. However, the authors design ThreadLocal to be weak references. In actual use, there may be ThreadLocal that has completed its function and is discarded by GC, and then a specific entry in entry[] will no longer occupy space. And since we know that magic numbers are circular, we’re bound to have hash collisions.

    Therefore, the author’s implementation must have a hash collision. How do you resolve hash collisions? Linear detection method. It doesn’t matter what the concept is, but we need to talk to people about it, so we still need to understand the concept.

  • So why did the author use circular arrays again?

    Because ring arrays can save memory, they can reuse space already allocated. The underlying reason here is to reduce the number of expansions. Because each expansion needs to allocate twice the original space, the content of the old entry[] is completely copied to the new entry[], which requires some calculation process and consumes performance. If you can reduce the frequency of capacity expansion, performance will be high. Average performance is of course considered here, because the concept of capacity expansion is small.

    Without the ring array, the frequency of capacity expansion would certainly increase, and the memory footprint would be higher.

    If ring array is used, not only memory space is saved, but also the number of capacity expansion is reduced.

    The author’s thinking and design are very skilled.

  • Does an array of rings increase the probability of hash collisions?

    Is.

    Suppose our GC is very timely and our magic correlation algorithm is cyclic. So collisions are bound to occur, and our storage space is circular, so using magic numbers together with circular arrays does increase the probability of hash collisions. This is a conjecture, and in science such a conjecture is reasonable as long as there is one possible opposite.

    Of course, it is also possible that the GC is not timely, so some useless entries cannot be cleared in time and occupy the space of the table. In this case, expansion may occur. Once expansion occurs, the probability of hash collisions will be reduced. The longer the length, the less likely it is to happen.

summary

  • Analysis of the relationship between magic number correlation algorithm and hash collision.

  • The magic number correlation algorithm is perfectly hashed, but the authors use a safeguard to worry about memory leaks, so ThreadLocal instances that are not in use are not GC away in time, so ThreadLocal is designed to be weak references. Solving one problem introduces another, that is, the hash of a magic number and the entry that is cleared will duplicate, so the author needs to resolve the hash collision.

  • A master is a master because he has the wisdom of weighing. He believes that the performance improvement and resource utilization benefits of space saving and reducing capacity expansion frequency outweigh the disadvantages of increasing hash collision probability. Because, no matter what, the author’s code design is bound to resolve hash collisions. That is, authors need to write the logic to resolve hash collisions regardless of whether they use circular arrays.

  • If you need to use multiple ThreaddLocal instances, you can define a reference object and let it hold the corresponding variables. Perhaps the author has a deeper consideration for this ThreadLocal. It is hoped that this design can meet a variety of usage scenarios. Remember, a university is built by a certain tree, the designer considered the future several hundred years later, may overhaul the university, so planted the corresponding trees in advance.

  • The author wanted to save memory space and reduce the frequency of expansion, so he designed the ring array. This circular array has the advantage of increasing the probability of hash collisions. This is the way things are in the world. He was born for one thing. He enjoys the benefits of his strengths, but also suffers the disadvantages of his weaknesses. If we choose to accept anyone or anything in our life, then we need to have such a mindset. Enjoy the good and tolerate the bad. It is a fact, a fact that cannot be changed. What’s the only way? Accept it. What if there are no good ones? Don’t accept it.

  • Suddenly thought of, in “Douluo Land”, super Soul Tang Hao said: “Forging fan iron, at the beginning of the forging, the force can not be too large, otherwise the iron will fracture, because the iron with a lot of impurities. At a certain point, iron becomes denser and can withstand more force, so it needs to be strengthened. Hammering iron is not about using as much force as possible, but about knowing when to use what kind of force. What does this mean? Success depends on forethought.