Go sync.map source code analysis

An overview of the

Map in Go language is not concurrency safe. Before Go 1.6, reading and writing map concurrently will lead to reading dirty data. After 1.6, the program will panic directly. Therefore, previous solutions are generally processed by introducing RWMutex(read and write lock). As for why GO does not support atomic operation of Map, generally speaking, operation on map atom reduces the performance of scenarios with only concurrent read or no concurrent read and write to some extent. However, as a server writing a service using Go, gorutine can access a map concurrently in most cases. Therefore, after 1.9, Go introduced a concurrency-safe map in the Sync package. This will be interpreted from the source code.

1. Methods provided by sync.map

Store data. Store keys and values of any type.

func (m *Map) Store(key, value interface{})
Copy the code

Deleting corresponding Keys

func (m *Map) Delete(key interface{})
Copy the code

Read the value of the corresponding key. Ok indicates whether the key is queried in the map

func (m *Map) Load(key interface{}) (value interface{}, ok bool)
Copy the code

If the value of loaded is true, the value exists; if the value of loaded is false, the value does not exist.

func (m *Map) LoadOrStore(key, value interface{}) (actual interface{}, loaded bool)
Copy the code

The callback function calls the key and value iterated. When the callback function returns false, the iterated is complete. Otherwise, all keys are iterated.

func (m *Map) Range(f func(key, value interface{}) bool)
Copy the code

Principle 2.

By introducing two maps, read and write are separated into different maps, where the Read map only provides reads and the dirty map only writes. In this way, concurrent reads can be performed without locking. When no value is read in a Read map, subsequent reads are locked and the number of misses is added. When the number of misses reaches a certain number, the dirty map is upgraded to a Read map.

In addition, although two maps are introduced, the underlying data store is a pointer to the same value.

For example, when keys 1,2, and 3 are inserted into a dirty map, the read map has no key value, and the dirty map is read and the number of misses is recorded

When the number of Misses is greater than or equal to the length of a dirty map, a dirty map is upgraded to a Read map. In this case, addresses of dirty maps are copied directly.

When a new key 4 is inserted, the key value in the Read map is copied to the dirty map, so that the dirty map contains all the values, and the address is copied directly when the dirty map is upgraded to a Read Map.

3. Source code analysis

3.1 Main Structure

The entry structure, which holds the interface pointer to the value, operates atomically through the atomic.

type entry struct {
	p unsafe.Pointer // *interface{}
}
Copy the code

Map structure, main structure, provides external methods, and data storage.

type Map struct {
	mu Mutex    

    // Store the readOnly and read it concurrently without locking it
	read atomic.Value // readOnly

    // Dirty map is used to store written data and can be upgraded directly to read map.
	dirty map[interface{}]*entry

    // Misses mainly record the number of times when the read map and dirty map are not read.
	misses int
}
Copy the code

ReadOnly structure, used primarily for storage

// readOnly is stored atomically in map.read,
type readOnly struct {
	m       map[interface{}]*entry
	amended bool // true if the dirty map contains some key not in m.
}

Copy the code

3.1 the Load method

func (m *Map) Load(key interface{}) (value interface{}, ok bool) {
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	if! ok && read.amended { m.mu.Lock()// Lock the dirty map and then read the contents of the read map. This prevents the data from being read from the dirty map.
        read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
		if! ok && read.amended { e, ok = m.dirty[key]// Record the number of misses. Before the dirty map is promoted to read map,
            // This key must be read in the dirty map with a lock.
			m.missLocked()
		}
		m.mu.Unlock()
	}
	if! ok {return nil.false
	}
	return e.load()
}
Copy the code

3.2 Store method

// Store sets the value for a key.
func (m *Map) Store(key, value interface{}) {
    // If a value is read in the Read Map, an atomic operation is attempted to update the value directly
	read, _ := m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok && e.tryStore(&value) {
		return
	}

    // If the value is not read in the read map or the update fails when the value is read, the lock is used for subsequent processing
	m.mu.Lock()
	read, _ = m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok {
        // If the value read is deleted, write the value to the dirty map
		if e.unexpungeLocked() {
			m.dirty[key] = e
		}
        // Use atomic operations to update the value of the key
		e.storeLocked(&value)
	} else if e, ok := m.dirty[key]; ok {
        // If a value is read in the dirty map, the value is updated directly using atomic operations
		e.storeLocked(&value)
	} else {
        // If there is no value in the dirty map, the dirty map has been upgraded to read map or entered for the first time
        // Initialize the dirty map and add the read map key to the newly created dirty map.
		if! read.amended { m.dirtyLocked() m.read.Store(readOnly{m: read.m, amended:true})
		}
		m.dirty[key] = newEntry(value)
	}
	m.mu.Unlock()
}
Copy the code

3.3 LoadOrStore method

The code logic is similar to Store

func (m *Map) LoadOrStore(key, value interface{}) (actual interface{}, loaded bool) {
	// Read the map without locking
	read, _ := m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok {
        // Try to update or read the value if it is read
		actual, loaded, ok := e.tryLoadOrStore(value)
		if ok {
			return actual, loaded
		}
	}

	m.mu.Lock()
	read, _ = m.read.Load().(readOnly)
    // Determine a read map at the request of the lock
	if e, ok := read.m[key]; ok {
		if e.unexpungeLocked() {
			m.dirty[key] = e
		}
		actual, loaded, _ = e.tryLoadOrStore(value)
	} else if e, ok := m.dirty[key]; ok {
		actual, loaded, _ = e.tryLoadOrStore(value)
		m.missLocked()
	} else {
		if! read.amended { m.dirtyLocked() m.read.Store(readOnly{m: read.m, amended:true})
		}
		m.dirty[key] = newEntry(value)
		actual, loaded = value, false
	}
	m.mu.Unlock()

	return actual, loaded
}
Copy the code

3.4 Range method

func (m *Map) Range(f func(key, value interface{}) bool) {

    // Get the median value of the read map
	read, _ := m.read.Load().(readOnly)
    // If there are still values in the dirty map, lock detection is performed
	if read.amended {
		m.mu.Lock()
		read, _ = m.read.Load().(readOnly)
        
		if read.amended {
            // Assign the dirty map to read, because dirty  The map contains all the values
			read = readOnly{m: m.dirty}
			m.read.Store(read)
			m.dirty = nil
			m.misses = 0
		}
		m.mu.Unlock()
	}

    // perform traversal
	for k, e := range read.m {
		v, ok := e.load()
		if! ok {continue
		}
		if! f(k, v) {break}}}Copy the code

3.5 the Delete method

func (m *Map) Delete(key interface{}) {
    // Get the read map first
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	if! ok && read.amended { m.mu.Lock()// Add a lock
		read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
        // No value in read map, delete in dirty map
		if! ok && read.amended {delete(m.dirty, key)
		}
		m.mu.Unlock()
	}
	if ok {
		e.delete()}}Copy the code

4. The limitations

Sync. map is not suitable for scenarios where there are a lot of simultaneous reads and writes. A large number of writes will cause the Read map to fail to read data and lock it for further reads. Meanwhile, the dirty map is upgraded to read map. The overall performance is low, especially in cache scenarios. Sync. map is suitable for append-only and heavy read and small write scenarios.

For a map, there is another hash based implementation method. In this way, lock the read and write of the map, allocate N maps, and determine which map to allocate according to the hash operation on the key. This reduces the cost of locking to 1/ N (theoretical value)

In contrast, the hash-based approach is easier to understand and provides more stable overall performance. Sync.map may perform worse in some scenarios, but in others it provides better performance. Therefore, it depends on the specific business scenario.

Reference 5.

go

sync.map