Click “like” to see, form a habit, wechat search [Three prince Aobing] follow this fool who seems to have something
This article has been included in GitHub github.com/JavaFamily, there are a line of big factory interview complete test points, materials, resume template, and my life program.
preface
Generating Random numbers in code is a very common feature, and the JDK provides a ready-made Random class to implement it, and the Random class is thread safe.
Here is an implementation of random.next () that generates a Random integer:
protected int next(int bits) {
long oldseed, nextseed;
AtomicLong seed = this.seed;
do {
oldseed = seed.get();
nextseed = (oldseed * multiplier + addend) & mask;
//CAS is inefficient with competition
} while(! seed.compareAndSet(oldseed, nextseed));return (int)(nextseed >>> (48 - bits));
}
Copy the code
As you can see, the above method uses a CAS operation to update the seed, and in a scenario where a large number of threads are competing, the CAS operation is likely to fail and retry, which in turn consumes CPU computation, resulting in a significant performance degradation.
Thus, while Random is thread-safe, it is not “highly concurrent.”
To address this problem, and enhance the performance of random number generators in high-concurrency environments, ThreadLocalRandom, a high-concurrency random number generator, is introduced.
ThreadLocalRandom inherits from Random. According to the Richter substitution principle, this indicates that ThreadLocalRandom provides the same Random number generation function as Random, but the implementation algorithm is slightly different.
Variables in Thread
To combat thread contention, Java has a ThreadLocal class that allocates a separate, unrelated storage space to each thread.
The ThreadLocal ThreadLocal implementation depends on the Thread object. ThreadLocalMap threadLocals member fields.
Similarly, in order for the random number generator to access only local Thread data and avoid contention, three additional members are added to Thread:
/** The current seed for a ThreadLocalRandom */
@sun.misc.Contended("tlr")
long threadLocalRandomSeed;
/** Probe hash value; nonzero if threadLocalRandomSeed initialized */
@sun.misc.Contended("tlr")
int threadLocalRandomProbe;
/** Secondary seed isolated from public ThreadLocalRandom sequence */
@sun.misc.Contended("tlr")
int threadLocalRandomSecondarySeed;
Copy the code
As members of the Thread class, these three fields are bound to each Thread object and become ThreadLocal variables. The random number generator that relies on these variables is called ThreadLocalRandom.
Eliminating pseudo sharing
I don’t know if you’ve noticed that there’s an annotation @sun.misc.Contended on top of each of these variables. What does that annotation do? To understand this, you need to know about an important problem in concurrent programming, pseudo sharing:
As we know, the CPU does not directly access memory, data is loaded into registers from the cache, and the cache has L1, L2, L3 and other levels. Let’s simplify the hierarchy of responsibilities here, assuming only a level 1 cache and one main memory.
The CPU reads and updates the cache in a behavior unit, also called a cache line, which is 64 bytes long.
Therefore, the problem is that a cache line can hold multiple variables. What happens if multiple threads access different variables at the same time and those different variables are located on the same cache line?
As shown in the figure above, X and Y are adjacent variables on the same cache row. Core1 and Core2 both load them, core1 updates X, and Core2 updates Y. Since data is read and updated in units of cache behavior, this means that when these two things happen at the same time, there is a race. This leads to the possibility that Core1 and Core2 will need to refresh their own data (cache rows have been updated by each other), which leads to a significant reduction in system performance. This is a pseudo-sharing problem.
So how can we improve it? The diagram below:
In the figure above, we use a separate cache row for X and a separate cache row for Y, so that both updates and reads are not affected.
The @sun.misc.Contended(” TLR “) code in the above code will help us generate some padding before and after variables at the virtual machine level, so that the labeled variables are on the same cache line and don’t conflict with other variables.
In the Thread objects, member variables threadLocalRandomSeed, threadLocalRandomProbe, threadLocalRandomSecondarySeed TLR is marked as the same group, Placing these three variables on a single cache line without conflicting with other variables improves access speed in a concurrent environment.
Efficient alternative to reflection
Random number generation requires access to threadLocalRandomSeed and other members of the Thread, but these members are visible within the package due to class encapsulation.
Unfortunately, ThreadLocalRandom is in the java.util.concurrent package and Thread is in the java.lang package, so, ThreadLocalRandom does not have access to variables such as threadLocalRandomSeed of a Thread.
At this point, Java veterans might jump in and say, “What is this? Look at my reflection method, I can dig out anything.”
True, reflection is a way to bypass encapsulation and access the data inside an object, but reflection doesn’t perform well enough to be a high-performance solution.
Is there a way for ThreadLocalRandom to access the internal members of a Thread while still having methods that are far more reflective and infinitely close to direct variable access? The answer is yes, which is to use the Unsafe class.
Here are the two Unsafe methods used:
public native long getLong(Object o, long offset);
public native void putLong(Object o, long offset, long x);
Copy the code
The getLong() method reads a long from o’s offset byte. PutLong () writes x to the offset of the first byte of o.
This type of c-like operation provides significant performance gains and, more importantly, can easily circumvent member visibility restrictions by bypassing field names and using offsets directly.
Addressing broadening concerns, how do I know the offset position of the threadLocalRandomSeed member in the Thread? This requires the unsafe objectFieldOffset() method, as shown in the following code:
When ThreadLocalRandom is initialized, the static Thread member variables threadLocalRandomSeed, threadLocalRandomProbe, ThreadLocalRandomSecondarySeed offset the position of the object.
Therefore, whenever ThreadLocalRandom needs to use these variables, it can be accessed via the unsafe getLong() and putLong() (and possibly getInt() and putInt()).
For example, when generating a random number:
protected int next(int bits) {
return (int)(mix64(nextSeed()) >>> (64 - bits));
}
final long nextSeed(a) {
Thread t; long r; // read and update per-thread seed
// In ThreadLocalRandom, the threadLocalRandomSeed variable of Thread is accessed
UNSAFE.putLong(t = Thread.currentThread(), SEED,
r = UNSAFE.getLong(t, SEED) + GAMMA);
return r;
}
Copy the code
To examine just how quickly the Unsafe method can fall, let’s take a look at it:
Addressing broadening concerns, we’ll write our own ThreadTest class, which uses both the reflection and unsafe methods to read and write threadLocalRandomSeed variables and compare their performance:
Unsafe() to read and write threadLocalRandomSeed 100 million times:
byUnsafe spend :171ms
byReflection spend :645ms
Copy the code
It’s not hard to see why the Unsafe method is so much better than the reflection method, which is one of the reasons why Unsafe has been so widely used in the JDK as an alternative to reflection.
Random number seed
We know that generate pseudorandom number needs to be a seed, threadLocalRandomSeed and threadLocalRandomSecondarySeed is the seed of here. The threadLocalRandomSeed is long, threadLocalRandomSecondarySeed is int.
ThreadLocalRandomSeed is the most widely used random number based on threadLocalRandomSeed. And threadLocalRandomSecondarySeed just certain JDK there are used in the internal implementation, is not widely used.
The initial seed uses system time by default:
The initialization of the SEED is done in the above code, and the initialized SEED is stored in the location of the SEED via UNSAFE (i.e., threadLocalRandomSeed).
We can then use the nextInt() method to get random integers:
public int nextInt(a) {
return mix32(nextSeed());
}
final long nextSeed(a) {
Thread t; long r; // read and update per-thread seed
UNSAFE.putLong(t = Thread.currentThread(), SEED,
r = UNSAFE.getLong(t, SEED) + GAMMA);
return r;
}
Copy the code
Each call to nextInt() updates threadLocalRandomSeed with nextSeed(). Since this is a thread-specific variable, there are no races at all, no CAS retries, and performance is greatly improved.
Function of Probe
In addition to the seed, there is a threadLocalRandomProbe variable. What does this variable do?
We can think of threadLocalRandomProbe as a Hash value (not zero) for each Thread that can be used as an eigenvalue for a Thread based on which the Thread can find a specific location in the array.
static final int getProbe(a) {
return UNSAFE.getInt(Thread.currentThread(), PROBE);
}
Copy the code
Take a look at a code snippet:
CounterCell[] as; long b, s;
if((as = counterCells) ! =null| |! U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
CounterCell a; long v; int m;
boolean uncontended = true;
if (as == null || (m = as.length - 1) < 0 ||
Use probe to find a position in array AS for each thread
// Since each thread has a different probe value, there is a high probability that the elements in the array corresponding to each thread are also different
// Each thread corresponds to a different element, so that full concurrent operations can be performed without conflict
// Therefore, the probe here plays a role in preventing conflicts
(a = as[ThreadLocalRandom.getProbe() & m]) == null| |! (uncontended = U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {Copy the code
In the concrete implementation, if a clash between the above code, so, you can also use ThreadLocalRandom. AdvanceProbe () method to modify a thread probe value, so that we can further to avoid possible future conflicts, thereby reducing competition, improve the concurrency performance.
static final int advanceProbe(int probe) {
// Calculate an updated probe value based on the current probe value
probe ^= probe << 13; // xorshift
probe ^= probe >>> 17;
probe ^= probe << 5;
// Updating the probe value to the thread object modifies the threadLocalRandomProbe variable
UNSAFE.putInt(Thread.currentThread(), PROBE, probe);
return probe;
}
Copy the code
conclusion
Today, we introduced the ThreadLocalRandom object, a high-performance random number generator in a high-concurrency environment.
Not only did we introduce the functionality and internal implementation principles of ThreadLocalRandom, we also introduced how ThreadLocalRandom objects achieve high performance (e.g. through pseudo-sharing, Unsafe, etc.), we hope you can apply these techniques to your projects.
Do the little fools have a deeper understanding of this underdog? Take a wave in the comments section: Get Stronger
I’m Aobing, and the more you know, the more you don’t know, and I’ll see you next time.
This article is constantly updated. You can search “Santaizi Aobing” on wechat and read it for the first time. Reply [Information] There are the interview materials and resume templates for first-line big factories prepared by me.