Preface
In a technical group discussion today, the topic of concurrency safety came up. I have actually written about Golang's map before, but that article didn't elaborate on sync.Map. Thinking back, all I could recall was "two maps, one for reading and one for writing, something something". Roughly right, but fuzzy. So let's write it down properly.
1. Why sync.Map?
2. How to use sync.Map?
3. How is sync.Map implemented (source code)?
4. Pros and cons of sync.Map?
5. Going further?
Main text
1. Why sync.Map?
Golang's built-in map is the place to start.
Why do we need sync.Map at all? The reason is simple: the built-in map is not concurrency-safe. Concurrent reads alone are fine, but concurrent writes (or a write racing with a read) are not. So, for both concurrency safety and efficiency, the standard library provides its own map: sync.Map.
1.1 What Are the Problems of Concurrent Map Writing?
First, let's see what happens when a plain map is written concurrently without any protection:
```go
package main

import (
	"fmt"
	"time"
)

func main() {
	m := map[int]int{1: 1}
	go do(m)
	go do(m)
	time.Sleep(1 * time.Second)
	fmt.Println(m)
}

func do(m map[int]int) {
	i := 0
	for i < 10000 {
		m[1] = 1
		i++
	}
}
```
Output:
```
fatal error: concurrent map writes
```
Sure enough: concurrent writes crash the program.
1.2 The straightforward solution
Add a big lock.
```go
package main

import (
	"fmt"
	"sync"
	"time"
)

var s sync.RWMutex

func main() {
	m := map[int]int{1: 1}
	go do(m)
	go do(m)
	time.Sleep(1 * time.Second)
	fmt.Println(m)
}

func do(m map[int]int) {
	i := 0
	for i < 10000 {
		// lock
		s.Lock()
		m[1] = 1
		// unlock
		s.Unlock()
		i++
	}
}
```
Output:
```
map[1:1]
```
This finally behaves correctly, but what could go wrong with it? A single big lock is rarely the optimal solution; it usually becomes an efficiency problem, because locking the whole map blocks operations on every other element too.
Solution: reduce the time spent holding the lock. Two methods: 1. Trade space for time. 2. Shrink the lock's scope.
sync.Map uses exactly these ideas. Keep reading.
2. How to use sync.Map?
The code:
```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	m := sync.Map{}
	m.Store(1, 1)
	go do(&m)
	go do(&m)
	time.Sleep(1 * time.Second)
	fmt.Println(m.Load(1))
}

// Pass a pointer: copying a sync.Map copies its internal lock,
// which go vet flags as a bug.
func do(m *sync.Map) {
	i := 0
	for i < 10000 {
		m.Store(1, 1)
		i++
	}
}
```
Output:
```
1 true
```
Runs fine, and no explicit locking needed. Neat.
3. How is sync.Map implemented?
Let's start with the general logic in plain English, to speed things up. Write: write directly to dirty. Read: check read first; only fall back to dirty on a miss.
Now let's walk through the source in detail, following the pattern "core structures + create/read/update/delete".
3.1 Core structures
The core data structure of sync.Map:

```go
type Map struct {
	mu Mutex
	read atomic.Value // readOnly
	dirty map[interface{}]*entry
	misses int
}
```
| Field | Type | Role |
|---|---|---|
| mu | Mutex | The lock. Protects the dirty field below. |
| read | atomic.Value | Holds the data served to reads. Because it is an atomic.Value and treated as read-only, access is concurrency-safe. The actual stored type is readOnly. |
| misses | int | A counter. Each time a lookup misses in read, it is incremented. |
| dirty | map[interface{}]*entry | Contains the latest written data. When misses reaches a threshold, dirty is promoted to read. |
The two supporting structures also deserve a brief description.
The readOnly data structure:

```go
type readOnly struct {
	m map[interface{}]*entry
	amended bool
}
```
| Field | Type | Role |
|---|---|---|
| m | map[interface{}]*entry | A plain map |
| amended | bool | true when dirty contains data that m does not |
The entry data structure:

```go
type entry struct {
	// p points to the stored value. read and dirty hold entries
	// redundantly, but they share the same *entry pointer.
	p unsafe.Pointer // *interface{}
}
```
The main point of this structure: read and dirty store entries redundantly, but since the value is held behind a pointer, the extra storage cost is small.
3.2 Query (Load)
```go
func (m *Map) Load(key interface{}) (value interface{}, ok bool) {
	// read is read-only and thread-safe, so check it first
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	// If read misses and dirty has newer data, look in dirty
	if !ok && read.amended {
		m.mu.Lock()
		// Double check (as in the previous chapter: the check and
		// the lock are not atomic, so read may have changed in between)
		read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
		// Still not in read, and dirty has newer data
		if !ok && read.amended {
			e, ok = m.dirty[key]
			// miss count +1; may trigger promotion of dirty
			m.missLocked()
		}
		m.mu.Unlock()
	}
	if !ok {
		return nil, false
	}
	return e.load()
}

func (m *Map) missLocked() {
	m.misses++
	if m.misses < len(m.dirty) {
		return
	}
	// Promote dirty to read, because misses are penetrating too often
	// (the atomic store itself is very cheap)
	m.read.Store(readOnly{m: m.dirty})
	m.dirty = nil
	m.misses = 0
}
```
Flow chart:
How is the promotion threshold set?
By comparing the miss count against the length of dirty.
Why can dirty be promoted to read wholesale?
Because writes only ever go to dirty, dirty is guaranteed to be the most up-to-date: its data is a superset of read's. (But wait: dirty is set to nil after promotion, so how can it stay a superset? More on that later.)
Why is dirty set to nil after promotion?
As I understand it: on one hand, once read equals dirty exactly, no lookup ever falls through, so keeping dirty is useless. On the other hand, keeping every element in dirty would slow down later inserts when the map is large.
3.3 Delete
```go
func (m *Map) Delete(key interface{}) {
	// Read read first, asserting the readOnly type
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	// If read misses and dirty has new elements, go to dirty.
	// amended is true when dirty holds data that read does not.
	if !ok && read.amended {
		m.mu.Lock()
		// Check again: the check above and the lock are not atomic,
		// so things may have changed in between
		read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
		if !ok && read.amended {
			// Delete directly from dirty
			delete(m.dirty, key)
		}
		m.mu.Unlock()
	}
	if ok {
		// The key exists in read: set its value to nil
		// (delete by marking!)
		e.delete()
	}
}

func (e *entry) delete() (hadValue bool) {
	for {
		p := atomic.LoadPointer(&e.p)
		if p == nil || p == expunged {
			return false
		}
		// Atomic compare-and-swap to nil
		if atomic.CompareAndSwapPointer(&e.p, p, nil) {
			return true
		}
	}
}
```
Flow chart:
A few points worth emphasizing:
1. Why is a delete in dirty a real delete, while a delete in read only marks?
read is consulted before dirty, and both hold the same *entry pointer. Marking the shared entry deleted in read means a lookup stops at read and never penetrates to dirty. For example, when amended == false, lookups never touch dirty at all, yet they still observe the deletion.
2. Why can dirty be deleted from directly, instead of mark-then-delete like read?
Because a direct delete is cheap. Marking first and deleting later would mean finding the key twice; there is no need for the repeated work.
3. How is the mark applied?
By setting the entry's value pointer to nil. (This detail is crucial later.)
3.4 Store (insert and update)
```go
func (m *Map) Store(key, value interface{}) {
	// If the key exists in read and is not marked expunged,
	// try to update it in place
	read, _ := m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok && e.tryStore(&value) {
		return
	}
	// The key is missing from read, or marked expunged
	m.mu.Lock()
	read, _ = m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok {
		// The key exists in read.
		// If the entry was expunged, dirty does not contain the key:
		// add it back to dirty, then update the entry.
		if e.unexpungeLocked() {
			// The same pointer goes into dirty
			m.dirty[key] = e
		}
		// Update the value
		e.storeLocked(&value)
	} else if e, ok := m.dirty[key]; ok {
		// The key exists only in dirty: update it
		e.storeLocked(&value)
	} else {
		// Neither read nor dirty has the key.
		// If read == dirty (amended is false), rebuild dirty first
		// (dirty was set to nil when it was promoted to read).
		if !read.amended {
			// Copy read's live entries into dirty
			m.dirtyLocked()
			// Mark read as differing from dirty, since new data
			// is about to be added
			m.read.Store(readOnly{m: read.m, amended: true})
		}
		m.dirty[key] = newEntry(value)
	}
	m.mu.Unlock()
}

// dirtyLocked copies read's undeleted entries into a fresh dirty map
func (m *Map) dirtyLocked() {
	if m.dirty != nil {
		return
	}
	read, _ := m.read.Load().(readOnly)
	m.dirty = make(map[interface{}]*entry, len(read.m))
	for k, e := range read.m {
		// Skip entries that are nil/expunged
		if !e.tryExpungeLocked() {
			m.dirty[k] = e
		}
	}
}

// tryExpungeLocked upgrades entries marked nil to expunged
func (e *entry) tryExpungeLocked() (isExpunged bool) {
	p := atomic.LoadPointer(&e.p)
	for p == nil {
		// Mark data that was deleted (set to nil) as expunged
		if atomic.CompareAndSwapPointer(&e.p, nil, expunged) {
			return true
		}
		p = atomic.LoadPointer(&e.p)
	}
	return p == expunged
}

// tryStore updates the entry unless it has been expunged
func (e *entry) tryStore(i *interface{}) bool {
	p := atomic.LoadPointer(&e.p)
	if p == expunged {
		return false
	}
	for {
		if atomic.CompareAndSwapPointer(&e.p, p, unsafe.Pointer(i)) {
			return true
		}
		p = atomic.LoadPointer(&e.p)
		if p == expunged {
			return false
		}
	}
}

// unexpungeLocked flips an expunged entry back to nil
func (e *entry) unexpungeLocked() (wasExpunged bool) {
	return atomic.CompareAndSwapPointer(&e.p, expunged, nil)
}

// storeLocked unconditionally stores the value
func (e *entry) storeLocked(i *interface{}) {
	atomic.StorePointer(&e.p, unsafe.Pointer(i))
}
```
Flow chart:
- What distinguishes the two delete marks in read?
nil marks a normal deletion. At that point dirty may or may not exist: (a) right after dirty is promoted to read, dirty is nil; (b) once dirty has been rebuilt, nil entries are not copied over, so the key is absent from dirty.
expunged marks an entry that was skipped while dirty was being rebuilt, so the key definitely does not exist in dirty.
- Possible performance issues?
Rebuilding dirty only copies pointers, but if read is large, traversing it all can still take noticeable time.
4. Pros and cons of sync.Map?
First the conclusion, then the proof.
Advantages: it's official, the standard library's own child; and by separating reads from writes, it shortens the time spent holding the lock, improving efficiency.
Disadvantages: it is not suited to write-heavy scenarios. Heavy writes make the read map keep missing, which forces taking the lock, and dirty keeps getting promoted to read, so overall performance degrades.
Best fit: read-heavy, write-light workloads.
Now let's prove why it suits read-heavy workloads, by comparing the write and read throughput of a mutex-guarded plain map against sync.Map, both concurrency-safe:
```go
package main

import (
	"fmt"
	"sync"
	"time"
)

var s sync.RWMutex
var w sync.WaitGroup

func main() {
	mapTest()
	syncMapTest()
}

func mapTest() {
	m := map[int]int{1: 1}
	startTime := time.Now().Nanosecond()
	w.Add(1)
	go writeMap(m)
	w.Add(1)
	go writeMap(m)
	//w.Add(1)
	//go readMap(m)
	w.Wait()
	endTime := time.Now().Nanosecond()
	timeDiff := endTime - startTime
	fmt.Println("map:", timeDiff)
}

func writeMap(m map[int]int) {
	defer w.Done()
	i := 0
	for i < 10000 {
		// lock
		s.Lock()
		m[1] = 1
		// unlock
		s.Unlock()
		i++
	}
}

func readMap(m map[int]int) {
	defer w.Done()
	i := 0
	for i < 10000 {
		s.RLock()
		_ = m[1]
		s.RUnlock()
		i++
	}
}

func syncMapTest() {
	m := sync.Map{}
	m.Store(1, 1)
	startTime := time.Now().Nanosecond()
	w.Add(1)
	go writeSyncMap(&m)
	w.Add(1)
	go writeSyncMap(&m)
	//w.Add(1)
	//go readSyncMap(&m)
	w.Wait()
	endTime := time.Now().Nanosecond()
	timeDiff := endTime - startTime
	fmt.Println("sync.Map:", timeDiff)
}

// Note: pass *sync.Map, not sync.Map — copying a sync.Map copies its
// internal lock, which go vet flags and which would skew the comparison.
func writeSyncMap(m *sync.Map) {
	defer w.Done()
	i := 0
	for i < 10000 {
		m.Store(1, 1)
		i++
	}
}

func readSyncMap(m *sync.Map) {
	defer w.Done()
	i := 0
	for i < 10000 {
		m.Load(1)
		i++
	}
}
```
| Scenario | Results |
|---|---|
| Write only | map: 1022000; sync.Map: 2164000 |
| Read and write | map: 8696000; sync.Map: 2047000 |
You'll find that in a write-heavy scenario, sync.Map is less efficient than map+mutex, because it does more work per operation.
5. Going further?
Does this remind you of MySQL's table locks versus row locks? A map guarded by a single sync.RWMutex is essentially a table lock.
sync.Map is, in essence, also table-level locking; it just adds a second map, but the nature is the same. So the optimization direction is obvious: reduce the lock granularity as much as possible to gain speed.
Shard one big map into n small maps and hash the key to pick a shard; the lock granularity then drops to 1/n. Searching around online, someone has indeed implemented this: click here
(Yes, I'm lazy, haha: this last part is copied from an earlier article of mine.)
If you feel you've gained something, you can follow the public account "brown Alpaca" to get my posts and interview experience first-hand~