“This article has participated in the good article call order activity, click to see: back end, big front end double track submission, 20,000 yuan prize pool for you to challenge!”

The map itself is concurrency insecure

We all know that go maps are concurrent unsafe. Fatal Error: concurrent map writes occur when several Goruotine (s) either read or write a map at the same time

  1. At the beginning of the program we initialize a map
  2. The subgoroutine is assigned to m[a]
  3. The main goroutine assigns m[a]

In theory, any time you have a multi-core CPU, if both the sub-goroutine and the main gouroutine are running at the same time, you will have a problem. Go run-race main.go

By testing, we can find that there isdata raceThat is, the data competition problem. Some people say this is simple, lock solve, lock can solve, but you know, the cost of the lock problem.

Leaving aside the data race problem, we can see the cost of locking by looking at an example:

  1. BenchmarkAddMapWithUnLockThe test is unlocked
  2. BenchmarkAddMapWithLockIt’s a test with a lock

Go test-bench. The results are as follows:

It can be found that the average lockless time is approximately6.6 ms, the average time with the lock is about7.0 ms, although similar, but also reflect the cost of locking. In some complex cases, it may be more obvious.

sync.map

Some say that since locking is expensive, use go’s built-in sync.map method, which solves the concurrency problem. Sync. map does solve the concurrent map problem, but it is suitable for the case of more read and less write. It can guarantee concurrent security without the cost of locking. In the case of more write and less read, it may be worse, mainly because of its design.

structure

  1. Mutex locks, which are used when dirty data operations are involved
  2. Read: no lock is required for a read. It is read from a read. Read is of type atomic.

M is also a map. Value is a pointer to an entry. An entry is a structure of the following types:

There’s just a p in there, and when we set the value of a key, we can think of p as a pointer to that value.

readOnly.amended = true“Indicates that the data read is not the latest. Dirty contains new keys that are not read.

3. Map dirty is also a Map type, so it is named dirty. In some scenarios, when you add kv, you add dirty first.

4. When the number of misses is greater than or equal to the length of the dirty, the number of dirty is assigned to the read. Reset both missed and dirty.

For example

The core idea of Sync. map is that space changes time. Let’s say there is an exhibition showing n paintings. A group of people come to see them. They can see whatever they want at the exhibition without waiting or waiting in line. At this time, I have a new painting, but because the exhibition is in working hours, I cannot directly hang it, and the new painting may need to maintain something, so I will not put it on the exhibition (read) for the time being, so I will put it in the backup warehouse (dirty) first. If someone really wants to see this new painting, I can only take him to the warehouse (dirty) to see it. Suppose a new painting comes at this time, and there are n+1 paintings in the warehouse. At this time, someone asks: Do you have this new painting? The manager says: Yes, you and I go to the warehouse to have a look. Then someone asked if they had this new painting. The manager said, Yes, you and I will go to the warehouse and have a look. When the number of requests for the new painting reached n+1, the exhibition owner realized that the new painting was still in demand. So he said to the manager, “Go and see. When no one reads the paintings in the exhibition, remove all the read paintings and replace all the dirty ones in the warehouse.” When all the managers were changed, there were already the most complete and up-to-date paintings on the read. Sync. map’s principle is similar to the above example: when a few people are interested in new paintings (new K, V), they take them to the dirty warehouse to see them. At this time, there is only one manager, so only one person can take them at a time (lock), which is inefficient. Other paintings can be read at random, which is efficient.

Store (add or update a kV)

  1. When the key has a read, then it is an update value. An attempt is made to update the value directly. A successful update is returned without locking. Here’s a tryStore:

TryStore has judgmentp == expungedReturns false. There are three types of P:nilWhen the key in read is deleted, it is soft deleted, only p is set to nil.expunged(deleted key (p==nil) is set to expunged when read copy goes to dirty),Addresses of other normal valuesIf expunged is used, then update value is not selected.

  1. Lock, and everything else is thread safe.
  2. In the process of locking, there may be a key that does not exist originally, so check again. If there is a key in read, and it was originally deleted by the dirty, then restore the key in the dirty, and finally set the value.
  3. If there is no key in read, but there is in dirty, then change value directly
  4. If neither read nor dirty has the key and the dirty is nil, try to copy the undeleted key from the read to the dirty. (Deleting read is not really deleting, it will set entry.p to nil.) This is done in dirtyLocked:

Then set the new K, V in dirty. The new k and V are added to the dirty map first, but the read is not.

Set amended=true when dirty is relatively clean data (it has cleared nil or expunged key) (it indicates that the dirty is not empty and that there is new data in it)

7. To unlock

Conclusion:

  1. You can see that for updates, both read and dirty are updated because value is a pointer and the underlying is a value
  2. New items are added first to dirty and not to read
  3. If you want to add a new key, you need to lock it. If you want to add a new key, you need to lock it every timeif elseBranch judgment of. The overall locking performance is definitely worse than that of regular Map.

Load (get a kV)

  1. When this key is not present in read and amenbed=true, lock (dirty is not thread-safe)
  2. Check again because the read may change during the locking process
  3. Get data from dirty
  4. = =len(dirty); = = =len(dirty); = = = = = = = = = = = = = = = = = = = = =
  5. Remake the dirty and misses.

6. Return nil if there is no corresponding key, or value if there is

Conclusion:

  1. If there is a key in the read, there is no need to lock and return directly, which is efficient and environment-friendly
  2. If dirty has a key, it reverses the read by recording the number of misses, tolerating the lock time for a miss, and ultimately reading the read for the new key.

Delete (Delete a k)

  1. When the read key does not exist and the dirty has new data, the lock is added
  2. Check again because the read may change during the locking process
  3. When there is new data in dirty, delete the K in dirty directly
  4. If read does, then soft delete, set p to nil

Summary: When the deleted key is in the read, it can be marked by soft deletion. In this way, the map corresponding to the read will not be expanded by the same amount due to frequent deletion. You can refer to the expansion rules of mapPrinciple of the map.

Back to the topic

Sync. map is good for more reads and less writes. It is certainly better than regular map locking, but it is not good for more writes and less reads because of the overhead involved in frequent locking, read and dirty exchanges, and may even be worse than regular Map locking. Let’s take an extreme example:

  1. BenchmarkAddMapWithUnLockThe test is unlocked
  2. BenchmarkAddMapWithLockIt’s a test with a lock
  3. BenchmarkAddMapWithSyncMapIs to test the sync. The map

All three methods add 10W pieces of data to a map.

Go test-bench. The results are as follows:

Sync. map takes about five times longer than the other two. Sync. map is a good thing, but the wrong scenario can backfire.