General documentation: Article directory Github: github.com/black-ant
1. Volatile basics
> volatile ensures memory visibility and prohibits instruction reordering.
> volatile provides a happens-before guarantee: a change made to a volatile variable by one thread is visible to other threads that read it afterwards.
> In short, it guarantees visibility between threads and provides a degree of ordering.
// Java allows volatile arrays, but volatile applies only to the reference to the array, not to the array's elements.
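The visibility guarantee described above can be sketched with a stop flag. This is a minimal illustration rather than code from the original article: the class name StopFlagDemo, the method runDemo, and the timings are assumptions chosen for the example.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not taken from the original article.
class StopFlagDemo {
    // volatile makes the main thread's write visible to the worker thread.
    private static volatile boolean running = true;

    /** Returns true if the worker observed the flag change and terminated. */
    static boolean runDemo() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {          // volatile read on every iteration
                Thread.onSpinWait();   // spin hint, requires Java 9+
            }
        });
        worker.setDaemon(true);        // never keeps the JVM alive on its own
        worker.start();
        try {
            TimeUnit.MILLISECONDS.sleep(50);
            running = false;           // volatile write
            worker.join(2000);         // wait at most 2 seconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runDemo());
    }
}
```

Without volatile on the flag, the JIT compiler is allowed to hoist the read out of the loop, and the worker may then never terminate.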
2. Volatile in-depth knowledge points
> Reading and writing data in main memory is much slower than executing instructions in the CPU, so CPUs use caches to improve efficiency.
> CPU cache: a CPU's cache is private to that CPU and is relevant only to the threads running on it.
// Principle: see https://www.cnblogs.com/xrq730/p/7048693.html
Step 1: cache levels
- L1 cache: located next to the CPU core and the most tightly integrated with it.
- L2 cache: either on-chip or off-chip. An on-chip L2 cache runs at the same speed as the core clock, while an off-chip L2 cache runs at only half the core clock.
- L3 cache: available only on some high-end CPUs.
// Cache loading order
1> The program and data are loaded into main memory.
2> Instructions and data are loaded into the CPU cache.
3> The CPU executes the instructions and writes the results to the cache.
4> Data in the cache is written back to main memory.
// Step End: because each CPU has its own cache, cached data can become inconsistent, hence the rule:
When one CPU modifies a byte in its cache, the other CPUs in the machine are notified and their cached copies are treated as invalid.
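The invalidation rule above can be illustrated with a toy model. This is only a sketch: real cache coherence is implemented in hardware, and the classes Memory and Cache, the single cached value, and the write-through simplification are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: real cache coherence is implemented in hardware.
class ToyCoherence {
    /** Main memory holding a single value. */
    static final class Memory {
        int value;
    }

    /** A private per-CPU cache for that single value. */
    static final class Cache {
        private final Memory memory;
        private final List<Cache> peers = new ArrayList<>();
        private int cached;
        private boolean valid = false;

        Cache(Memory memory) {
            this.memory = memory;
        }

        void connect(Cache other) {
            peers.add(other);
            other.peers.add(this);
        }

        int read() {
            if (!valid) {              // miss: reload from main memory
                cached = memory.value;
                valid = true;
            }
            return cached;
        }

        void write(int v) {
            cached = v;
            valid = true;
            memory.value = v;          // simplification: write through immediately
            for (Cache peer : peers) { // notify the other CPUs: their copy is stale
                peer.valid = false;
            }
        }
    }

    static int demo() {
        Memory memory = new Memory();
        Cache cpu0 = new Cache(memory);
        Cache cpu1 = new Cache(memory);
        cpu0.connect(cpu1);
        cpu1.read();                   // cpu1 caches the initial value 0
        cpu0.write(7);                 // invalidates cpu1's copy
        return cpu1.read();            // cpu1 reloads from main memory
    }

    public static void main(String[] args) {
        System.out.println(demo());    // prints 7
    }
}
```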
3. volatile and synchronized
- volatile essentially tells the JVM that the value of the variable held in a register (working memory) cannot be trusted and must be read from main memory. synchronized acquires a lock so that only the thread holding it can access the guarded variable; other threads that need the same lock are blocked.
- volatile can be applied only to variables. synchronized can be applied to methods and code blocks, locking on an object or a class; it cannot be applied to a variable declaration.
- volatile guarantees only the visibility of changes to a variable, not atomicity. synchronized guarantees both visibility and atomicity.
- volatile never blocks a thread. synchronized may cause a thread to block.
- Variables marked volatile are not optimized by the compiler. Variables accessed under synchronized can still be optimized by the compiler.
Note: volatile does not replace synchronized
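As a rough illustration of where each keyword can appear, the sketch below puts volatile on a field and synchronized on an instance method, a static method, and a block. The class SyncLevels and its members are hypothetical names introduced for this example.

```java
// Hypothetical class showing where volatile and synchronized may be applied.
class SyncLevels {
    private static int classCounter = 0;
    private int instanceCounter = 0;
    private int blockCounter = 0;
    private final Object lock = new Object();
    private volatile boolean finished = false;               // volatile: fields only

    synchronized void incInstance() { instanceCounter++; }   // monitor: this

    static synchronized void incClass() { classCounter++; }  // monitor: SyncLevels.class

    void incBlock() {
        synchronized (lock) {                                // monitor: explicit object
            blockCounter++;
        }
    }

    /** Runs the three increments from several threads and returns the final counters. */
    static int[] run(int threads, int perThread) {
        classCounter = 0;
        SyncLevels s = new SyncLevels();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    s.incInstance();
                    incClass();
                    s.incBlock();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();                                    // join makes the workers' writes visible
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted", e);
            }
        }
        s.finished = true;                                   // volatile write, non-blocking
        return new int[] {s.instanceCounter, classCounter, s.blockCounter};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run(8, 1000)));
    }
}
```

Because every increment runs under a monitor, all three counters end at threads * perThread.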
4. Principles of volatile
Comparing the assembly code generated with and without the volatile keyword shows that, with volatile, there is an extra lock-prefixed instruction. The lock prefix acts as a memory barrier. A memory barrier is a set of processor instructions that enforces ordering constraints on memory operations. The underlying implementation of volatile relies on memory barriers.
- The extra instruction: lock addl $0x0,(%rsp)
- The lock prefix is applied to the write operation and locks the bus and the corresponding address. Other reads and writes must wait until the lock is released.
- When the write is complete, the lock is released and the cache is flushed to main memory.
- Reading a volatile variable is easier to understand: no additional assembly instructions are required. If the CPU finds that the cached address is locked, it waits until the lock is released, and the cache coherence protocol ensures that it reads the latest value.
- In summary, only volatile writes need to lock the bus, so that other reads and writes wait until it is released. The lock invalidates the corresponding cache lines on other CPUs, which then reload the data from memory.
// Memory semantics of volatile
- When a volatile variable is written, the JMM immediately flushes the shared variable values in that thread's local memory to main memory.
- When a volatile variable is read, the JMM invalidates that thread's local memory and reads the shared variable directly from main memory.
In other words, a volatile write flushes directly to main memory, and a volatile read reads directly from main memory.
// How the memory semantics of volatile are implemented
To implement the memory semantics of volatile, the JMM limits reordering:
1. If the first operation is a volatile read, it cannot be reordered, no matter what the second operation is.
   - This ensures that operations after a volatile read are not reordered by the compiler to before the volatile read.
2. If the second operation is a volatile write, it cannot be reordered, no matter what the first operation is.
   - This ensures that operations before a volatile write are not reordered by the compiler to after the volatile write.
3. If the first operation is a volatile write and the second operation is a volatile read, they cannot be reordered.
// The underlying implementation of volatile: memory barriers
With memory barriers, reordering can be avoided. It is almost impossible for the compiler to find an optimal placement that minimizes the total number of barriers inserted, so the JMM adopts a conservative strategy:
- Before each volatile write, insert a StoreStore barrier. This ensures that all preceding normal writes have been flushed to main memory before the volatile write.
- After each volatile write, insert a StoreLoad barrier. This prevents the volatile write from being reordered with volatile reads or writes that may follow.
- After each volatile read, insert a LoadLoad barrier. This prevents the processor from reordering the volatile read with the normal reads that follow.
- After each volatile read, insert a LoadStore barrier. This prevents the processor from reordering the volatile read with the normal writes that follow.
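The reordering rules above are what make a volatile flag usable for publishing ordinary data. The following is a minimal sketch under that reading; the class VolatilePublish and its field names are hypothetical and chosen for the example.

```java
// Hypothetical example: publishing a plain field through a volatile flag.
class VolatilePublish {
    private int data = 0;                    // ordinary field
    private volatile boolean ready = false;  // volatile flag

    void writer() {
        data = 42;     // normal write: may not be reordered after the volatile write below
        ready = true;  // volatile write
    }

    int reader() {
        while (!ready) {          // volatile read
            Thread.onSpinWait();  // spin hint, requires Java 9+
        }
        return data;   // normal read: may not be reordered before the volatile read above
    }

    static int demo() {
        VolatilePublish p = new VolatilePublish();
        int[] seen = new int[1];
        Thread readerThread = new Thread(() -> seen[0] = p.reader());
        Thread writerThread = new Thread(p::writer);
        readerThread.start();
        writerThread.start();
        try {
            readerThread.join();
            writerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

Once the reader sees ready == true, the happens-before relationship guarantees that it also sees data == 42. If ready were not volatile, neither the loop exit nor the value of data would be guaranteed.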
5. Volatile atomicity
// volatile variables and atomic variables: volatile does not guarantee atomicity
A volatile variable ensures a happens-before relationship, meaning that a write happens before any subsequent read, but it does not guarantee atomicity. For example, if the count variable is declared volatile, count++ is still not atomic. The AtomicInteger class provides methods that make such operations atomic: the getAndIncrement() method, for example, atomically increments the current value by one. Similar operations are available for other data types and for reference variables.
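A short sketch of the AtomicInteger approach mentioned above. The class name AtomicCounterDemo and the method countWithAtomic are assumptions made for the example; AtomicInteger and getAndIncrement() are the standard java.util.concurrent.atomic API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: many threads incrementing one AtomicInteger.
class AtomicCounterDemo {
    /** Increments a shared counter from several threads and returns the final value. */
    static int countWithAtomic(int threads, int perThread) {
        AtomicInteger count = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    count.getAndIncrement(); // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted", e);
            }
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(100, 100)); // prints 10000
    }
}
```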
6. Volatile source code
TODO: the source-code analysis is left for later. For now, see https://www.cnblogs.com/xrq730/p/7048693.html
// Key points:
0x0000000002931351: lock add dword ptr [rsp],0h  ;*putstatic instance
                                                ; - org.xrq.test.design.singleton.LazySingleton::getInstance@13 (line 14)
> The instruction adds 0 to the doubleword at the address held in the stack pointer register. The addition itself changes nothing; the lock prefix is what ensures the memory visibility of the volatile keyword.
// Basic concept 1: what LOCK# does
- It locks the bus: all read and write requests to memory from other CPUs are blocked until the lock is released.
- Later processors lock the cache instead of the bus, because locking the bus is expensive: other CPUs cannot access memory while the bus is locked.
- The lock writes the modified data back, and it invalidates the related cache lines on other CPUs so that they reload the latest data from main memory.
- It is not a memory barrier, but it works like one: it prevents instructions on either side of it from being reordered across it.
// Basic concept 2: cache lines
- When the CPU sees an instruction that reads from memory, it passes the memory address to the level 1 data cache.
- The level 1 data cache checks whether the corresponding segment is already loaded.
// Cause: volatile is implemented on top of cache coherence
Step 2: Cache coherence protocols keep multiple sets of caches behaving as if they were a single set of caches.
Step 3: Common protocols are snooping and MESI. Snooping is used to mediate all memory access operations.
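The disassembly above refers to a putstatic on a field named instance inside LazySingleton::getInstance. The referenced class is not shown in this article, so the following is only a hypothetical reconstruction of what such a class commonly looks like: a double-checked-locking singleton with a volatile instance field.

```java
// Hypothetical reconstruction; the class referenced by the disassembly is not shown in the article.
class LazySingleton {
    // volatile prevents other threads from seeing a partially constructed object.
    private static volatile LazySingleton instance;

    private LazySingleton() {
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance;          // single volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;                // re-check under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;            // volatile write (putstatic instance)
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // prints true
    }
}
```

The volatile write to instance corresponds to the putstatic in the listing, which is where the lock-prefixed instruction appears in the generated x86 code.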
7. Volatile measurement
// Test atomicity. ThreadC result: ------> count :9823 <-------
// In the thread class
public static void addCount() {
    for (int i = 0; i < 100; i++) {
        count++; // read-modify-write: not atomic, so concurrent increments can be lost
    }
    logger.info("------> count :{} <-------", count);
}

// In the test: start 100 ThreadC instances
ThreadC[] threadCS = new ThreadC[100];
for (int i = 0; i < 100; i++) {
    threadCS[i] = new ThreadC();
}
for (int i = 0; i < 100; i++) {
    threadCS[i].start();
}

// With synchronized. ThreadD result: ------> count :10000 <-------
synchronized public static void addCount()
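For readers who want to run the comparison, here is a self-contained sketch of the same experiment. It is not the author's test class: the name VolatileCountTest, the two separate counters, and the use of System.out instead of a logger are assumptions made so that the example has no external dependencies.

```java
// Hypothetical, self-contained version of the experiment above.
class VolatileCountTest {
    private static volatile int volatileCount = 0; // visible, but ++ is not atomic
    private static int syncCount = 0;              // guarded by the class monitor

    static void addVolatile() {
        for (int i = 0; i < 100; i++) {
            volatileCount++;                       // increments can be lost under contention
        }
    }

    static synchronized void addSync() {
        for (int i = 0; i < 100; i++) {
            syncCount++;                           // only one thread at a time runs this loop
        }
    }

    /** Runs the task on the given number of threads and waits for all of them. */
    private static void runAll(Runnable task, int threads) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(task);
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted", e);
            }
        }
    }

    /** Returns {volatile result, synchronized result}. */
    static int[] run() {
        volatileCount = 0;
        syncCount = 0;
        runAll(VolatileCountTest::addVolatile, 100);
        runAll(VolatileCountTest::addSync, 100);
        return new int[] {volatileCount, syncCount};
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("volatile count     : " + result[0]);
        System.out.println("synchronized count : " + result[1]);
    }
}
```

The synchronized counter always ends at 10000. The volatile counter may end anywhere up to 10000, depending on how the threads interleave, so a single run that happens to print 10000 does not show that count++ is safe.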