preface
Consistent Hashing algorithm (Consistent Hashing and Random Trees) was first described in the paper Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web was proposed. In simple terms, a consistent hash organizes the entire hash space into a virtual ring (0-2^32-1).
advantage
- Scalability. Consistent hashing algorithm ensures the least change of data storage when adding or reducing servers, and saves the cost of data movement greatly compared with traditional hashing algorithm.
- Better adapt to the rapid growth of data. Distribution data based on the consistency of hash algorithm, when the data is growing, some virtual node may contain a lot of data, data on the virtual node distribution is not balanced, this time can will contain the data of virtual node split, this division is only the virtual node into two of the original, don’t need to rethink all data hash and division. If the load on physical servers is still unbalanced after virtual nodes are split, you only need to adjust the storage distribution of some virtual nodes among servers. This allows the number of physical servers to be dynamically expanded as the data grows, at a much lower cost than redistributing all the data using traditional hashing algorithms.
practice
The consistent hash algorithm in this paper is implemented
- Configure virtual nodes on the server to balance nodes
- The weight of the server can be set according to the pressure the server can bear
package main
import (
"crypto/sha1"
"fmt"
"sort"
"strconv") // Server structure address and storage weighttypeServer struct {addr String weight int} var Servers []server // DefaulthashNode count const defaultNodeNum = 100 // Base virtual nodetype virtualNode struct {
nodeKey string
spotVal uint32
}
type nodes struct {
virtualNodesArray []virtualNode
}
func (p *nodes) Len() int { return len(p.virtualNodesArray) }
func (p *nodes) Less(i, j int) bool { return p.virtualNodesArray[i].spotVal < p.virtualNodesArray[j].spotVal }
func (p *nodes) Swap(i, j int) { p.virtualNodesArray[i], p.virtualNodesArray[j] = p.virtualNodesArray[j], p.virtualNodesArray[i] }
func (p *nodes) Sort() {sort.sort (p)} // Generate corresponding uint32 func getUint32Val(s string) (V uint32) {// sha1 h := sha1.new () defer H.reset () h.Write([]byte(s))hashBytes := h.sum (nil) // Go language bit operator processingif len(hashBytes[4:8]) == 4 {
v = (uint32(hashBytes[3]) << 24) | (uint32(hashBytes[2]) << 12) | (uint32(hashBytes[1]) << 6) | (uint32(hashBytes[0]) << 3)
}
return
}
func (p *nodes) setVirtualNodesArray(servers []server) {
if len(servers) < 1 {
return} // Maintain a map based on weights and nodes - all of themhashThe value on the circle corresponds to IPfor_, v := range Servers {totalVirtualNodeNum := defaultNodeNum * v.eightfori := 0; i < totalVirtualNodeNum; I ++ {iString := strconv.itoa (I) // virtualAddr := fmt.sprintf("%s:%s", v.addr, iString) virNode := virtualNode{ nodeKey: v.addr, spotVal: getUint32Val(virtualAddr), } p.virtualNodesArray = append(p.virtualNodesArray, Func (P * Nodes) getNodeSever(W uint32) (addr String){I := sort.Search(len(p.virtualNodesArray), func(i int) bool {return p.virtualNodesArray[i].spotVal >= w })
return p.virtualNodesArray[i].nodeKey
}
func main() {
vNodes := new(nodes)
servers = append(servers, server{"127.0.0.1", 1}, server{"127.0.0.2", 2}, server{"127.0.0.3", 3}) / / ascribed values, to generate virtual node vNodes. SetVirtualNodesArray (servers) / / to the corresponding file name, as the key file fname: ="demo.jpg"
uint32Val := getUint32Val(fname)
ser := vNodes.getNodeSever(uint32Val)
fmt.Println("File storage server",ser)
}
Copy the code
additional
This is just the basic use of consistent hash. If you have any ideas on algorithm optimization or other function addition, the author is very happy to learn and improve with you.