This is the 26th day of my participation in the Gwen Challenge. Following on from the previous post: Hello, what about the volatile keyword? (2)

4.4 Cache Consistency Protocol: MESI

The MESI protocol has four states: Modified, Exclusive, Shared, and Invalid. The four states are represented by two bits.

  • Modified: the Cache Line is valid; the data exists only in the current Cache and has been modified, so it is inconsistent with main memory.
  • Exclusive: the Cache Line is valid; the data exists only in the current Cache and is consistent with main memory.
  • Shared: the Cache Line is valid; the data exists not only in the current Cache but is shared by multiple Caches, and each Cache is consistent with main memory.
  • Invalid: the current Cache Line is invalid.
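The four states can be captured in a small sketch. The enum below is purely illustrative (the names and the `encode` helper are my own, not part of any real API); it just shows that two bits are enough to distinguish the four states.

```java
// Illustrative sketch only: the four MESI states, encodable in two bits.
enum MesiState {
    MODIFIED,  // line valid; data changed and exists only here; main memory is stale
    EXCLUSIVE, // line valid; data exists only here and matches main memory
    SHARED,    // line valid; other caches hold the same data; all match main memory
    INVALID;   // line holds no usable data

    // Two bits are enough to encode all four states.
    int encode() {
        return ordinal();
    }
}
```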

In the MESI protocol, each Cache controller not only knows about its own reads and writes, but also listens for the reads and writes of other caches. This is called snooping (sniffing), and the controller performs different listening tasks depending on the current state of its Cache Line.

4.4.1 Status change process

Let's use a simple example to analyze how this snooping works. Suppose we have a dual-core CPU, and main memory contains a variable i whose value is 1. The CPUs need to perform some operations on i and therefore read it into their caches.

Step 1: CPU1 reads the variable from main memory into its cache. The variable i in CPU1's cache is 1, and the Cache Line is in the E (Exclusive) state. CPU1 keeps listening for other caches loading the same variable from main memory.

Step 2: CPU2 also reads variable i from main memory and loads it into its cache. CPU1 sniffs this event and changes its Cache Line state to S; CPU2 reads the data and sets its Cache Line state to S as well. Each CPU's Cache Line now listens both for events that require it to invalidate its own copy of i and for requests from other caches to read i.

Step 3: After CPU1 completes its calculation, the Cache controller changes the state of its Cache Line to M and sends an invalidation notice to the other CPUs. CPU2 receives the notice and changes its Cache Line state to I (invalid). CPU1 now listens for other caches trying to read i from main memory; CPU2's Cache Line, being invalid, no longer listens for anything.

Step 4: CPU2 needs variable i again, but the Cache Line that stored it has been invalidated, so it must fetch it from main memory. CPU1 sniffs this read request and synchronizes the modified value back to main memory before the read proceeds. After the write-back, i = 2 in main memory, and the Cache controller briefly sets CPU1's Cache Line to E. Then, just as in Step 2, the Cache Lines of both CPUs end up in the S state.
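The four steps above can be summarized as a simple state trace. The class below is a hypothetical illustration (not a real simulator): each entry records the pair of Cache Line states (CPU1/CPU2) after the corresponding step.

```java
// Hypothetical trace of the walkthrough above: Cache Line states after each step.
public class MesiWalkthrough {
    static String[] trace() {
        return new String[] {
            "E/I", // step 1: CPU1 loads i alone -> Exclusive; CPU2 holds nothing
            "S/S", // step 2: CPU2 also loads i -> both Shared
            "M/I", // step 3: CPU1 writes i = 2 -> Modified; CPU2 invalidated
            "S/S"  // step 4: CPU2 re-reads; CPU1 writes back first -> both Shared
        };
    }

    public static void main(String[] args) {
        for (String states : trace()) {
            System.out.println(states);
        }
    }
}
```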

4.4.2 Principles of state change

In general, the MESI protocol follows the following principles for CPU reads and writes:

  • CPU read request: a Cache Line in the M, E, or S state can be read directly; in the I state, the CPU can only read the data from main memory.
  • CPU write request: a Cache Line in the M or E state can be written directly; in the I state, the Cache Line is invalid and cannot be written. In the S state, it can be written only after the copies in the other caches have been set to Invalid.
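These rules can be sketched as a tiny decision table. `MesiRules` and its method names are hypothetical, chosen here only to make the read/write rules concrete.

```java
// Hypothetical sketch of the MESI read/write rules described above.
public class MesiRules {
    enum State { M, E, S, I }

    // A read hits the cache in M, E, or S; in I it must go to main memory.
    static boolean readHitsCache(State s) {
        return s != State.I;
    }

    // A write proceeds directly in M or E; in S the other copies must be
    // invalidated first; in I the line cannot be written at all.
    static String writeAction(State s) {
        switch (s) {
            case M:
            case E: return "write directly";
            case S: return "invalidate other copies, then write";
            default: return "cannot write; line is invalid";
        }
    }

    public static void main(String[] args) {
        System.out.println(readHitsCache(State.S)); // true: S is readable
        System.out.println(writeAction(State.S));   // needs invalidation first
    }
}
```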
4.4.3 Problems introduced by MESI

Although the four states and sniffing techniques of the MESI protocol achieve cache consistency, they also bring some problems.

As mentioned above, when a CPU wants to write a computed result into a Cache Line, it must first send a notification to every other CPU that stores the same data, and it cannot perform the write until their states have changed. During this whole period the CPU waits in a synchronous block, which badly hurts performance.

To eliminate this blocking wait, the Store Buffer was introduced into the CPU. With this buffer, when the CPU wants to modify a value in the cache, it only needs to write the data into the Store Buffer, after which it can go on executing other instructions. Later, once the other CPUs have changed the state of the corresponding Cache Line to I (invalid), the buffered value is stored into the Cache Line and, if necessary, synchronized to main memory.

This scheme is asynchronous and solves the problem of CPU synchronous wait blocking. But it also introduces new problems.

  • Since this is an asynchronous operation, there is no guarantee about when the invalidation notices will be acknowledged by the other CPUs, so it is equally uncertain when the data in the Store Buffer is written into the Cache Line.
  • Before the other CPUs have acknowledged the state change, the current CPU may read the variable: it reads from the Store Buffer first, then (if not found) from the Cache Line, and finally from main memory.
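The read path in the second bullet can be sketched as follows. This is a simplified model under my own assumptions (real store buffers hold addresses and bytes, not named variables); it only shows the lookup order: Store Buffer, then cache, then main memory.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the read path: Store Buffer first, then cache, then memory.
public class StoreBufferRead {
    static Map<String, Integer> storeBuffer = new HashMap<>();
    static Map<String, Integer> cache = new HashMap<>();
    static Map<String, Integer> mainMemory = new HashMap<>();

    static int read(String var) {
        if (storeBuffer.containsKey(var)) return storeBuffer.get(var); // newest local write
        if (cache.containsKey(var)) return cache.get(var);             // cached copy
        return mainMemory.get(var);                                    // fall back to memory
    }

    public static void main(String[] args) {
        mainMemory.put("value", 1);
        cache.put("value", 1);
        storeBuffer.put("value", 2); // a write still parked in the Store Buffer
        // This CPU sees its own pending write even before it reaches the cache.
        System.out.println(read("value")); // 2
    }
}
```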

The new problem with the biggest impact is instruction reordering.

Let’s use an example to see what the problem is.

```java
int value = 1;
boolean finish = false;

void runOnCPU1() {
    value = 2;
    finish = true;
}

void runOnCPU2() {
    if (finish) {
        assert value == 2;
    }
}
```

Let's assume that the #runOnCPU1 and #runOnCPU2 methods run on two separate CPUs. It is tempting to assume the assertion can never fail, but it can. Here is one possible scenario.

The two key variables are cached by CPU1, with their cache lines in the following states:

|                  | value | finish |
| ---------------- | ----- | ------ |
| Cache Line state | S     | E      |

When executing the #runOnCPU1 method, CPU1 writes value = 2 into its Store Buffer (the line is in the S state), notifies the other CPUs storing the same variable to set their cache lines to I (invalid), and waits asynchronously for their acknowledgments while moving on to execute finish = true.

Because the Cache Line holding finish is in the E (Exclusive) state, finish = true can be written into the Cache Line immediately, without notifying any other CPU. When CPU2 then reads finish, CPU1's line for finish drops to S and the new value true becomes visible, synchronized through to main memory. CPU2 continues to assert value == 2: this instruction fetches value from main memory, but since CPU1's modified value is still parked in its Store Buffer, CPU2 reads the stale value 1 and the assertion fails.

In other words, from CPU2's point of view, finish is assigned earlier than value in the #runOnCPU1 method, which is the opposite of program order and different from what we expected. This is the visibility problem caused by instruction reordering.

This visibility problem can be solved with JMM memory barriers, and this is precisely why volatile is the killer feature for ensuring visibility in multithreaded environments. More on that in Hello, what about the volatile keyword? (4).
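As a preview, here is a minimal sketch of that fix applied to the example above: declaring finish as volatile makes the JMM insert the barriers that publish the earlier write to value before finish can be observed as true. The class name is my own; the fields and methods mirror the earlier example.

```java
// Sketch: the earlier example with the flag made volatile.
class VolatileFix {
    int value = 1;
    volatile boolean finish = false; // volatile write/read act as memory barriers

    void runOnCPU1() {
        value = 2;     // ordinary write
        finish = true; // volatile write: earlier writes become visible first
    }

    void runOnCPU2() {
        if (finish) {            // volatile read: sees everything before the write
            assert value == 2;   // now guaranteed to hold
        }
    }
}
```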


Don't rush off just yet! Leave a like and join the discussion in the comments. You're welcome to follow my column Interview Without Panic | Java Concurrent Programming, so a raise never has to wait on an interview. Follow me too; I'll only keep getting better.