CPU cache architecture

1. Every read starts at the L1 cache. On an L1 miss the CPU looks in L2, on an L2 miss it looks in L3, and on an L3 miss it reads from main memory.

2. Capacity: L3 > L2 > L1. Speed: L1 > L2 > L3.

3. Data is transferred in units of one cache line, typically 64 bytes. You can think of the cache as organized line by line, with every read or write moving a whole line at a time.

4. Cache coherence: the MESI protocol
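
To make item 4 a little more concrete, here is a minimal sketch of the MESI state transitions for a single cache line as seen by one core. It is a simplified teaching model, not a description of any specific CPU; the names MesiSketch, State, Event, and next are hypothetical, and real hardware adds bus messages, write-backs, and protocol variants.

```java
// Simplified MESI model for ONE cache line in ONE core's cache (illustrative only).
final class MesiSketch {
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }
    enum Event { LOCAL_READ, LOCAL_WRITE, REMOTE_READ, REMOTE_WRITE }

    /**
     * Returns the next state of this core's copy of the line.
     * othersHoldCopy only matters for a LOCAL_READ on an INVALID line.
     */
    static State next(State current, Event event, boolean othersHoldCopy) {
        switch (event) {
            case LOCAL_READ:
                if (current != State.INVALID) return current; // cache hit, state unchanged
                return othersHoldCopy ? State.SHARED : State.EXCLUSIVE;
            case LOCAL_WRITE:
                return State.MODIFIED; // copies in other cores get invalidated
            case REMOTE_READ:
                // A valid copy is downgraded to SHARED (a MODIFIED line is written back first).
                return current == State.INVALID ? State.INVALID : State.SHARED;
            case REMOTE_WRITE:
                return State.INVALID; // another core took ownership of the line
            default:
                throw new IllegalArgumentException("unknown event: " + event);
        }
    }

    public static void main(String[] args) {
        State core1 = State.SHARED;                        // core 1 holds a shared copy
        core1 = next(core1, Event.REMOTE_WRITE, true);     // core 0 writes the line
        System.out.println(core1);                         // INVALID
        core1 = next(core1, Event.LOCAL_READ, true);       // core 1 must reload the line
        System.out.println(core1);                         // SHARED
    }
}
```

The REMOTE_WRITE transition is the one that matters for the rest of this post: a write by one core invalidates the whole line in every other core.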

The role of volatile

1. Prevents instruction reordering

2. Every write to the variable is written back from the CPU core's cache to main memory

3. When one CPU core writes the value back, the copies of that cache line held by other CPU cores are invalidated, so those cores have to reload the data on their next read (MESI protocol).
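
The visibility effect described in points 2 and 3 can be seen with a small stop-flag sketch. The class name VolatileFlagDemo and the method stopsAfterFlagCleared are hypothetical, and the 50 ms and 2 s timings are arbitrary choices for illustration.

```java
// A worker spins on a volatile flag; the main thread clears the flag.
final class VolatileFlagDemo {
    // Without volatile, the worker would not be guaranteed to ever see the update.
    static volatile boolean running = true;

    /** Starts a spinning worker, clears the flag, and reports whether the worker stopped. */
    static boolean stopsAfterFlagCleared() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                Thread.onSpinWait(); // busy-wait until the flag is cleared (Java 9+)
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if the worker never exits
        worker.start();
        try {
            Thread.sleep(50);     // let the worker spin for a moment
            running = false;      // volatile write, guaranteed to become visible to the worker
            worker.join(2_000);   // wait up to 2 seconds for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsAfterFlagCleared());
    }
}
```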

Volatile test

1. Define a class

public static class VolatileBean {
        volatile long a;
        volatile long b;
}

2. Start two threads that write to a and b respectively

VolatileBean volatileBean = new VolatileBean();
// Thread 1 writes only field a
Thread t1 = new Thread(() -> {
    for (int i = 0; i < 10_000_000; i++) { volatileBean.a = i; }
});
// Thread 2 writes only field b
Thread t2 = new Thread(() -> {
    for (int i = 0; i < 10_000_000; i++) { volatileBean.b = i; }
});
long start = System.currentTimeMillis();
t1.start();
t2.start();
t1.join(); // join() throws InterruptedException; declare or handle it in the enclosing method
t2.join();
System.out.println(System.currentTimeMillis() - start); // elapsed time in ms

Running this several times prints values between 40 and 60 ms, and occasionally around 17 ms.

3. Add padding fields to VolatileBean

public static class VolatileBean {
        volatile long a;
        long p1;
        long p2;
        long p3;
        long p4;
        long p5;
        long p6;
        long p7;
        volatile long b;
}

Then run the code from step 2 several times again. The output is now roughly between 14 and 18 ms.
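
To compare both layouts in a single run, a self-contained harness might look like the sketch below. The names FalseSharingBench, Plain, Padded, time, timePlain, and timePadded are hypothetical. This is a rough timing loop with no JIT warm-up, so the numbers are only indicative; a harness such as JMH would be needed for a rigorous measurement.

```java
// Rough side-by-side timing of the two bean layouts (not a rigorous benchmark).
final class FalseSharingBench {
    static class Plain {
        volatile long a;
        volatile long b;
    }

    static class Padded {
        volatile long a;
        long p1, p2, p3, p4, p5, p6, p7; // padding fields, as in step 3
        volatile long b;
    }

    /** Runs the two writer tasks in parallel and returns the elapsed time in ms. */
    static long time(Runnable writerA, Runnable writerB) {
        Thread t1 = new Thread(writerA);
        Thread t2 = new Thread(writerB);
        long start = System.nanoTime();
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    static long timePlain(int iterations) {
        Plain bean = new Plain();
        return time(() -> { for (int i = 0; i < iterations; i++) bean.a = i; },
                    () -> { for (int i = 0; i < iterations; i++) bean.b = i; });
    }

    static long timePadded(int iterations) {
        Padded bean = new Padded();
        return time(() -> { for (int i = 0; i < iterations; i++) bean.a = i; },
                    () -> { for (int i = 0; i < iterations; i++) bean.b = i; });
    }

    public static void main(String[] args) {
        int iterations = 10_000_000; // same loop count as in step 2
        System.out.println("plain  (ms): " + timePlain(iterations));
        System.out.println("padded (ms): " + timePadded(iterations));
    }
}
```

Results depend on the CPU, the JVM, and how the operating system schedules the two threads, so expect some variation between runs.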

Analysis

Why does the code get faster after a few fields are added to VolatileBean?

  • As mentioned above, a cache line is 64 bytes and a long takes 8 bytes. With only a and b, the two fields occupy 16 adjacent bytes, so they normally end up in the same cache line.
  • After adding the seven long fields p1 to p7, the eight longs from a to p7 occupy 64 bytes, a full cache line. Assuming the JVM lays the fields out in declaration order, b sits 64 bytes after a and therefore has to fall in a different cache line.

  • As mentioned above, a write to a volatile field is written back and invalidates the copy held in other CPU cores' caches. Invalidation works on whole cache lines (64 bytes), not on single fields, so:
    1. When a and b share a cache line, a write to a invalidates that line in the other core's cache, which also invalidates b there. The other core has to reload the line before it can access b again.
    2. When a and b are in different cache lines, a write to one does not invalidate the line holding the other, so the other thread keeps working on the copy in its own cache.
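
The arithmetic behind the analysis above can be checked with a few lines of code. This sketch assumes a 64-byte line size and uses a made-up, line-aligned base address; CacheLineMath, lineIndex, and sameLine are hypothetical names, and the JVM does not expose real field addresses this way.

```java
// Which cache line does a byte address fall into? (assumes 64-byte lines)
final class CacheLineMath {
    static final int LINE_SIZE = 64; // bytes

    /** Index of the cache line containing the given non-negative byte address. */
    static long lineIndex(long address) {
        return address / LINE_SIZE;
    }

    /** True if both addresses fall into the same cache line. */
    static boolean sameLine(long addressA, long addressB) {
        return lineIndex(addressA) == lineIndex(addressB);
    }

    public static void main(String[] args) {
        long a = 4096; // hypothetical line-aligned address of field a
        System.out.println(sameLine(a, a + 8));  // b right after a: true
        System.out.println(sameLine(a, a + 64)); // b after seven padding longs: false
    }
}
```

Two fields 64 bytes apart can never share a 64-byte line, whatever the starting address is. Two fields 8 bytes apart share a line unless they happen to straddle a line boundary.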

Thoughts

As noted above, even when the class has only the two fields a and b, a run occasionally takes about 17 ms. My guess is that in those runs the two threads happened to be scheduled on the same CPU core, but I have not verified this.