Background
In a recent code review, some students had added the volatile keyword to variables carrying cached data. In a previous project, some students also marked the variables holding configuration data fetched from the configuration center as volatile. Today we will discuss whether volatile is actually necessary in these cases.
The role of the volatile keyword
1> It prevents instruction reordering.
2> It disables thread-local working-memory buffering for the variable, so reads and writes go straight to main memory, guaranteeing visibility across threads.
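As a minimal sketch of point 2 (the class name and the 100 ms sleep are illustrative assumptions, not from the article), a volatile stop flag guarantees that the worker thread observes the main thread's write:

```java
// Minimal visibility sketch: a volatile stop flag guarantees the worker
// thread sees the write made by the main thread and terminates.
public class StopFlagDemo {
    private static volatile boolean stop = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            long spins = 0;
            while (!stop) {        // volatile read: always observes the latest value
                spins++;
            }
            System.out.println("stopped after " + spins + " spins");
        });
        worker.start();
        Thread.sleep(100);
        stop = true;               // volatile write: published to all threads
        worker.join();
    }
}
```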
Classic usage scenarios
Scenario 1
```java
public static Singleton getInstance() {
    if (instance == null) {
        synchronized (Singleton.class) {      // 1
            if (instance == null) {           // 2
                instance = new Singleton();   // 3
            }
        }
    }
    return instance;
}
```
If the instance field is not declared volatile, this may fail because the memory model allows so-called "out-of-order writes": at //3, the reference can be assigned before the constructor finishes running, so a thread checking at //1 may acquire an instance that is not fully initialized.
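For completeness, here is the corrected pattern with the volatile field declaration that the snippet above omits (the class body is sketched for illustration):

```java
public class Singleton {
    // volatile forbids the reordering that could publish a half-built object
    private static volatile Singleton instance;

    private Singleton() { }

    public static Singleton getInstance() {
        if (instance == null) {                      // first check, no lock
            synchronized (Singleton.class) {
                if (instance == null) {              // second check, under the lock
                    instance = new Singleton();      // safe publication via volatile write
                }
            }
        }
        return instance;
    }
}
```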
Scenario 2
```java
private volatile int value;

public int getValue() {
    return value;
}

// Write operations must be synchronized: value++ is a read-modify-write,
// and volatile alone does not make it atomic.
public synchronized int increment() {
    return value++;
}
```
This code implements a thread-safe counter: the volatile keyword guarantees that every thread reads the latest value, while synchronized makes the increment atomic.
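A common alternative (not from the original article) is java.util.concurrent.atomic.AtomicInteger, which provides the same visibility guarantee plus lock-free atomic increments:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private final AtomicInteger value = new AtomicInteger();

    public int getValue() {
        return value.get();              // volatile-read semantics
    }

    // Like value++ above, returns the value before the increment,
    // but atomically and without a synchronized block.
    public int increment() {
        return value.getAndIncrement();
    }
}
```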
Pitfalls to avoid
Abuse of volatile is what I saw during those code reviews. In my opinion, the volatile keyword is not worth using when the data changes rarely and is not time-sensitive.
Take an example: fetching configuration data from the company's configuration center. Volatile is not recommended there.
A typical configuration center architecture looks like this
A configuration change flows from the user to the configuration center for centralized storage, and the configuration center then distributes the data to the actual machines. Previously, clients fetched the latest data from the configuration center on a fixed polling cycle: the client's polling period was 90 s.
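A sketch of what such a client-side holder might look like (the class and method names are assumptions, not the company's actual client): since the refresh loop only runs every 90 s, a plain, non-volatile field read on the request path is sufficient.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical config holder; names are illustrative, not a real client API.
public class ConfigHolder {
    // Plain (non-volatile) reference: a request thread may see a snapshot
    // that is a few ns stale, which is irrelevant next to the 90 s poll period.
    private Map<String, String> config = new ConcurrentHashMap<>();

    public String get(String key) {
        return config.get(key);
    }

    public void startPolling(ScheduledExecutorService scheduler) {
        scheduler.scheduleAtFixedRate(
            () -> config = fetchFromConfigCenter(),  // swap in the fresh snapshot
            0, 90, TimeUnit.SECONDS);
    }

    // Stand-in for the real remote call to the configuration center.
    private Map<String, String> fetchFromConfigCenter() {
        return new ConcurrentHashMap<>();
    }
}
```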
Adding the volatile keyword in this scenario only lets a thread see the latest value slightly sooner. Now let's measure how much sooner that actually is.
```java
public class VolatileTest {

    private boolean endRun = false;  // deliberately NOT volatile

    @Test
    public void noVolatile() throws Exception {
        Runnable r1 = new Runnable() {
            public void run() {
                int i = 0;
                while (!endRun) {
                    System.out.println("I am still running " + i++);
                }
            }
        };
        Runnable r2 = new Runnable() {
            public void run() {
                endRun = true;
            }
        };
        new Thread(r1).start();
        new Thread(r2).start();
        Thread.sleep(9000);
        System.out.println("end run");
    }
}
```
In this code, the first thread reads the endRun variable in a loop, and the second thread changes its value. As soon as the first thread sees the change made by the second thread, it exits the loop.
The running results show that after only two loop iterations, the first thread already reads the value changed by the other thread, even without volatile. Based on typical CPU access-latency figures, let's estimate the delay:
Assume a simple line of code takes on average five instructions, and each instruction takes about 1 ns. Each loop iteration executes two lines of code, and the loop ran twice before seeing the change, so the delay is roughly 5 × 1 ns × 2 × 2 = 20 ns. The real numbers will vary, but they should all stay at the nanosecond level.
Compared with the 90 s visibility delay of the polling cycle, a nanosecond-level delay can be ignored.
Now consider the cost of seeing the result those few nanoseconds earlier. The volatile keyword essentially bypasses the CPU caches (L1, L2, and so on) and forces direct main-memory access. Suppose the data fetched from the configuration center changes once a day, and the field would normally be served from the L2 cache. Adding volatile then increases each access from about 4 ns to about 100 ns. If this variable is read on every request at 1000 QPS, one day costs an extra 1 × 24 × 3600 × 1000 × (100 − 4) ns, which is about 8300 ms.
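The arithmetic behind that 8300 ms figure, spelled out (all inputs are the article's assumed latency and traffic numbers):

```java
public class VolatileCostEstimate {
    public static void main(String[] args) {
        long secondsPerDay = 24L * 3600;  // one day of traffic
        long qps = 1000;                  // reads per second
        long l2HitNs = 4;                 // L2-cache read (article's figure)
        long mainMemNs = 100;             // main-memory read (article's figure)

        long extraNs = secondsPerDay * qps * (mainMemNs - l2HitNs);
        // 86,400,000 reads/day * 96 ns = 8,294,400,000 ns
        System.out.println(extraNs / 1_000_000 + " ms extra per day");  // prints "8294 ms extra per day"
    }
}
```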
So the cost is orders of magnitude greater than the gain. To be honest, the absolute time cost is still small: an extra 8.3 s per day is acceptable for a typical system. But when there are many such variables, these small burdens add up, and the burden grows as system load grows.
The same reasoning applies to caches that are loaded into memory on a timer. Volatile is not recommended there either.
Other extended thoughts
The conclusion above is that volatile is not recommended when time sensitivity is low. Not because any single use will hurt the system much, but because the worst engineering habit is making an optimization that brings no real benefit while introducing systemic risk.
Last week, I reviewed the code of a colleague who has real technical ambition and likes to make small optimizations while writing code. That is a good habit, but it makes reviewers more demanding: ordinary business logic only affects its own area, while optimizations can affect other parts of the system.
This time, in several places where the log originally printed "update cache success", he changed it to "update cache success, affecting XX pieces of data", where XX comes from the Guava Cache's .size() method.
Reviewing this code, I took a closer look at the Guava cache's initialization path. It is quite complex and contains several assertions (arguments must not be null), and there is no unified try/catch around it, so an exception thrown somewhere inside could leave the instance incompletely initialized. I raised this concern. He proved to me, by walking through the code, that a null pointer really is impossible, and I agreed he was right.

But the whole exchange reflects my caution about this kind of change. If we ship a new function and it has a problem, the new function goes through gray release and gradual ramp-up during the observation period, so the impact is limited. But if I change other, existing parts, especially the parts everyone felt could never be a problem, and something goes wrong, the team's trust in me drops. Later changes then require repeated proof of impact, which is a very passive position to be in.
Source: www.tuicool.com/articles/Nn…