Codis is a distributed Redis solution that can manage a large number of Redis nodes. As a professional third-party push service provider, Getui has been focusing on providing efficient and stable message push services for developers for many years, with tens of billions of messages sent through its platform every day. Given the high demands that Getui's push services place on data volume, concurrency, and speed, we found in practice that a single Redis node easily becomes a performance bottleneck. After weighing various factors, we chose Codis to better manage and use Redis.

With the rapid growth of the company's business, our demand for data storage has also been increasing. Practice shows that with a single Redis node, high concurrency and massive data volumes can easily blow up memory.

In addition, we also keep the memory of each Redis node limited, for the following two reasons:

First, if a node's memory is too large, a full data synchronization takes too long, which increases the risk of synchronization failure. Second, an ever-growing number of Redis nodes leads to huge maintenance costs later on.

Therefore, we conducted in-depth research into three mainstream Redis node management solutions: Twemproxy, Codis, and Redis Cluster.

The biggest drawback of Twitter's open-source Twemproxy is its inability to scale smoothly. Redis Cluster, on the other hand, requires clients that support the Cluster protocol; using it means upgrading clients, which is a significant cost for many existing businesses. In addition, Redis Cluster's P2P mode increases communication costs and makes it hard to know the current state of the cluster, which undoubtedly adds to the difficulty of operations and maintenance.

Wandoujia's open-source Codis not only solves Twemproxy's scaling problem, it is also compatible with Twemproxy, and it matured and stabilized early, back when Redis Cluster (the official solution) still suffered frequent bugs. So we ultimately chose Codis to manage our large number of Redis nodes.

At present, Getui uses both Redis and Codis: small business lines use Redis, while business lines with large data volumes and many nodes use Codis.

To keep Codis clusters running stably, we need a clear understanding of how Codis works internally. Let's look at how Codis's Dashboard and Proxy work from the perspective of the Codis source code.

Codis is a proxy middleware developed in Go. Its position in the system is shown in the figure below:

The bottom layer of Codis handles request forwarding, non-stop data migration, and so on. To the client in front of it, Codis is transparent: you can simply think of the client as being connected to a Redis service with infinite memory.

Codis consists of the following four parts:
● Codis Proxy (codis-proxy)
● Codis Dashboard
● Codis Redis (codis-server)
● ZooKeeper/Etcd

Codis architecture

IV. The internal working principles of Dashboard

Dashboard is the cluster management tool of Codis. All cluster operations, including adding and removing proxies and servers and migrating data, must go through the Dashboard. The Dashboard startup process initializes some necessary data structures and performs operations on the cluster.

Dashboard startup process

The Dashboard startup process consists of two phases: New() and Start().

In the New() phase, the Dashboard first reads the configuration file and populates the config information. If the coordinator value is "zookeeper" or "etcd", a ZooKeeper or etcd client is created. A Topom{} object is then created from the config. Topom{} is important because it holds information about all the nodes in the cluster (slots, groups, servers, etc.) at a point in time. The New() method fills in the Topom{} object.

Next, it starts listening on port 18080 to handle the corresponding API requests.

Finally, it starts a background goroutine that cleans up invalid clients in the pool once per minute.

Below is the data structure that corresponds to the Dashboard in memory after New():
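As a rough substitute, the sketch below gives an illustrative picture of the kind of cluster state Topom{} caches. The field and type names are simplified assumptions for illustration, not the actual Codis source.

```go
package main

import "fmt"

// SlotMapping records which group a slot belongs to.
type SlotMapping struct {
	Id      int
	GroupId int
}

// Group holds one master and its slaves (codis-server addresses).
type Group struct {
	Id      int
	Servers []string
}

// Proxy describes a registered codis-proxy instance.
type Proxy struct {
	Id    int
	Token string
	Addr  string
}

// Topom caches a point-in-time view of the whole cluster:
// slots, groups, and proxies.
type Topom struct {
	slots   []*SlotMapping
	groups  map[int]*Group
	proxies map[string]*Proxy
}

func main() {
	t := &Topom{
		slots:   make([]*SlotMapping, 1024),
		groups:  map[int]*Group{1: {Id: 1, Servers: []string{"10.0.0.1:6379"}}},
		proxies: map[string]*Proxy{},
	}
	fmt.Println("slots:", len(t.slots), "groups:", len(t.groups))
}
```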

In Start(), the Dashboard first writes models.Topom{} to ZK under /codis3/codis-demo/topom.

Then it sets topom.online = true.

Next, it obtains the latest slotMapping, group, proxy, and other data from ZK through topom.store and fills them into topom.cache. If a value in topom.cache is empty, the corresponding slotMapping, proxy, or group information is fetched from ZK via the store and used to fill the cache. The cache is empty not only on first startup: whenever elements of the cluster (servers, slots, etc.) change, dirtyCache is called and the corresponding information in the cache is set to nil, so that the next access fetches the latest data from ZK via topom.store.
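To make the cache behaviour concrete, here is a minimal sketch of the "fill on demand, invalidate on change" pattern described above; the types and the loadFromZK helper are illustrative assumptions, not the Codis API.

```go
package main

import "fmt"

// cache holds the in-memory copy of cluster metadata; nil means "dirty",
// i.e. the data must be reloaded from the coordinator (ZK/etcd).
type cache struct {
	slots []int
}

// loadFromZK stands in for topom.store reading slotMapping data from ZK.
func loadFromZK() []int {
	return make([]int, 1024)
}

// refillIfNeeded reloads the cache only when it has been invalidated.
func (c *cache) refillIfNeeded() {
	if c.slots == nil {
		c.slots = loadFromZK()
	}
}

// dirty invalidates the cache after a cluster change (server/slot added, etc.).
func (c *cache) dirty() { c.slots = nil }

func main() {
	c := &cache{}
	c.refillIfNeeded()
	fmt.Println("slots cached:", len(c.slots))

	c.dirty() // e.g. a group was modified
	c.refillIfNeeded()
	fmt.Println("slots reloaded:", len(c.slots))
}
```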

Finally, it starts four goroutine for-loops to process actions.
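The sketch below illustrates the shape of these background loops: each one wakes up on an interval and performs one round of work. This article only names RefreshRedisStats, a sync-action loop, and a slot-action loop; the loop names and intervals below are illustrative assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// startLoop runs step once per interval in its own goroutine,
// mirroring the "goroutine for-loop" pattern used by the Dashboard.
func startLoop(name string, interval time.Duration, step func()) {
	go func() {
		for {
			step()
			time.Sleep(interval)
		}
	}()
	fmt.Println("started loop:", name)
}

func main() {
	startLoop("RefreshRedisStats", time.Second, func() { /* poll codis-server INFO */ })
	startLoop("ProcessSyncAction", time.Second, func() { /* run pending master/slave syncs */ })
	startLoop("ProcessSlotAction", time.Second, func() { /* drive pending slot migrations */ })
	startLoop("CleanupPool", time.Minute, func() { /* drop invalid clients from the pool */ })

	time.Sleep(100 * time.Millisecond) // keep the demo alive briefly
}
```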

Creating a group

The process of creating a group is simple. First, the Dashboard pulls the latest slotMapping, group, and proxy data from ZK through topom.store and fills them into topom.cache.

Next, based on the latest data in memory, it checks whether the group ID already exists and whether it lies in the range 1 to 9999.

Then it creates a group{} object in memory and calls the ZK client to create the path /codis3/codis-demo/group/group-0001.

Initially, this group is empty:

```json
{
    "id": 1,
    "servers": [],
    "promoting": {},
    "out_of_sync": false
}
```

Adding a codis-server

Next, we add a codis-server to the group. The Dashboard connects to the back-end codis-server to check whether the node is running properly.

Then it runs the slotsinfo command on the codis-server. If the command fails, the process of adding the codis-server terminates.

Next, it pulls the latest slotMapping, group, and proxy data from ZK through topom.store and fills them into topom.cache. Based on the latest data in memory, it checks whether the current group is in the middle of a master/slave switchover, and then checks whether the group server already exists in ZK.

Finally, it creates a groupServer{} object and writes it to ZK. Once the codis-server has been added successfully, recall that Topom{} starts four goroutine for-loops in Start(); one of them, RefreshRedisStats(), puts the codis-server connection into topom.stats.redisp.Pool.

Tips: Topom{} starts four goroutine for-loops in Start(), and RefreshRedisStats() puts the codis-server connections into topom.stats.redisp.Pool.

RefreshRedisStats() executes once per second. It retrieves all codis-servers from topom.cache, then looks up a client in topom.stats.redisp.Pool by each codis-server's addr. If a client is found, it runs the info command with it; if not, it creates a client, adds it to the pool, and runs the info command with the new client. The result of the info command is saved into topom.stats.servers.
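The per-address "look up a client, create it if missing" pattern can be sketched as follows; the pool and client types below are simplified stand-ins, not the actual topom.stats.redisp implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a stand-in for a Redis client bound to one codis-server address.
type client struct{ addr string }

// Info stands in for sending the Redis INFO command and reading the reply.
func (c *client) Info() (string, error) { return "role:master", nil }

// pool caches one client per codis-server address.
type pool struct {
	mu      sync.Mutex
	clients map[string]*client
}

// getOrCreate returns the cached client for addr, creating it if absent.
func (p *pool) getOrCreate(addr string) *client {
	p.mu.Lock()
	defer p.mu.Unlock()
	if c, ok := p.clients[addr]; ok {
		return c
	}
	c := &client{addr: addr}
	p.clients[addr] = c
	return c
}

func main() {
	p := &pool{clients: map[string]*client{}}
	stats := map[string]string{} // stands in for topom.stats.servers

	for _, addr := range []string{"10.0.0.1:6379", "10.0.0.2:6379"} {
		if info, err := p.getOrCreate(addr).Info(); err == nil {
			stats[addr] = info
		}
	}
	fmt.Println(stats)
}
```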

Codis Server master/slave synchronization

After two nodes are added to a group, clicking the master/slave synchronization button turns the second node into a slave of the first.

The first step is to refresh topom.cache: we use topom.store to retrieve the latest slotMapping, group, and proxy data from ZK and fill them into topom.cache.

Then a judgment is made based on the latest data: if group.Promoting.State != models.ActionNothing, the group's Promoting field is not empty, which means the two codis-servers in the group are already performing a master/slave switchover, and the synchronization fails.

If group.Servers[index].Action.State == models.ActionPending, the slave node's state is already pending, and the master/slave synchronization also fails.

After these checks pass, the maximum action.Index of all codis-servers whose state is ActionPending is taken and incremented by 1, and the result is assigned to the current codis-server. The state of the current slave node is then set with g.Servers[index].Action.State = models.ActionPending, and this information is written to ZK.

Topom{} starts four goroutine for-loops in Start(), one of which specifically handles master/slave synchronization.

After the master/slave synchronization button is clicked on the page, the corresponding data structures in memory change accordingly:

The group information in ZK:

As mentioned, one of the four goroutine for-loops started by Topom{} in Start() specifically handles master/slave synchronization. So how does it work?

First, it obtains the latest slotMapping, group, and proxy data from ZK through topom.store and fills them into topom.cache. From the fresh cache data, it finds the group server that needs master/slave synchronization, i.e. the one with group.Servers[index].Action.State == models.ActionSyncing.

Second, the Dashboard connects to the node acting as slave, starts a Redis transaction, and executes the master/slave synchronization commands:

```go
c.Send("MULTI")                               // start the transaction
c.Send("config", "set", "masterauth", c.Auth)
c.Send("slaveof", host, port)
c.Send("config", "rewrite")
c.Send("client", "kill", "type", "normal")
c.Do("exec")                                  // execute the transaction
```

After the master/slave synchronization commands have been executed, group.Servers[index].Action.State is set to "synced" and written to ZK. At this point, the entire master/slave synchronization process is complete.

During master/slave synchronization, a codis-server goes through a total of five states from start to finish:

● "" (ActionNothing): the initial, empty state
● pending (ActionPending): the master/slave synchronization button was clicked on the page and the state was written to ZK
● syncing (ActionSyncing): the background goroutine for-loop is processing the master/slave synchronization
● synced: the goroutine for-loop handled the synchronization successfully and wrote the state to ZK
● synced_failed: the goroutine for-loop failed to handle the synchronization and wrote the state to ZK

After codis-servers have been added to the Codis cluster and master/slave synchronization has been performed, the next step is to allocate the 1024 slots to the codis-servers. Codis offers several ways to do this: you can move a slot with a specified number to a specified group, or move multiple slots of one group to another group. The most convenient way, however, is automatic rebalance.

Through topom.store, we obtain the latest slotMapping, group, and proxy data from ZK and fill it into topom.cache. Based on the cache, a slot allocation plan is generated, e.g. plans = {0:1, 1:1, ..., 342:3, ..., 512:2, ..., 853:2, ..., 1023:3}, where the key is the slot ID and the value is the group ID. The slotMapping information is then updated according to the plan: Action.State = ActionPending and Action.TargetId = the target group ID assigned to the slot, and the updated information is written back to ZK.
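The sketch below shows what such a plan looks like as data: a map from slot ID to target group ID, here simply spread evenly across groups. It only illustrates the plan structure; the real auto-rebalance logic in Codis also takes existing assignments into account.

```go
package main

import "fmt"

// makePlan spreads the 1024 slots evenly over the given group IDs and
// returns a map of slot ID -> target group ID.
func makePlan(groupIds []int) map[int]int {
	const numSlots = 1024
	plan := make(map[int]int, numSlots)
	for slot := 0; slot < numSlots; slot++ {
		plan[slot] = groupIds[slot*len(groupIds)/numSlots]
	}
	return plan
}

func main() {
	plan := makePlan([]int{1, 2, 3})
	fmt.Println(plan[0], plan[342], plan[512], plan[853], plan[1023])

	// Each entry would then be written back into the slotMapping as:
	//   Action.State    = ActionPending
	//   Action.TargetId = plan[slot]
}
```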

Topom{} starts four goroutine for-loops in Start(), one of which handles slot allocation.

SlotMapping:

● ProcessSlotAction() executes once per second. After a sequence of processing logic, it tries to obtain a client from topom.action.redisp.Pool by the codis-server's addr and then runs SLOTSMGRTTAGSLOT on Redis. If a client is available, the Dashboard uses it to run the migration command on Redis; if not, it creates a client, adds it to the pool, and then uses it to run the migration command.
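As a hedged illustration of the migration step itself, the sketch below uses the redigo client to ask a source codis-server to push one slot to a target server with SLOTSMGRTTAGSLOT. The argument order (target host, target port, timeout in milliseconds, slot ID) is an assumption based on codis-server's slots commands; verify it against your codis-server version.

```go
package main

import (
	"fmt"

	"github.com/gomodule/redigo/redis"
)

// migrateSlot asks the source codis-server to migrate all keys of one slot
// to the target codis-server.
func migrateSlot(sourceAddr, targetHost string, targetPort, timeoutMs, slot int) error {
	conn, err := redis.Dial("tcp", sourceAddr)
	if err != nil {
		return err
	}
	defer conn.Close()

	// The reply reports how many keys were moved in this round.
	reply, err := conn.Do("SLOTSMGRTTAGSLOT", targetHost, targetPort, timeoutMs, slot)
	if err != nil {
		return err
	}
	fmt.Println("migrated keys:", reply)
	return nil
}

func main() {
	if err := migrateSlot("10.0.0.1:6379", "10.0.0.2", 6379, 1000, 30); err != nil {
		fmt.Println("migration failed:", err)
	}
}
```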

SlotMapping provides seven action states:

V. Internal working principle of Proxy

Proxy startup process

The proxy startup process consists of New(), Online(), reinitProxy(), and receiving client requests.

In the New() phase, the Proxy first creates a Proxy{} structure object in memory and performs various assignments. Next, it starts listening on ports 11080 and 19000. It then starts three goroutine background threads to handle the corresponding operations:
● The Proxy starts a goroutine background thread to process requests on port 11080;
● The Proxy starts a goroutine background thread to process requests on port 19000;
● The Proxy starts a goroutine background thread that pings the codis-servers to keep the backend bc (backend connections) alive.

In the Online() phase, the Proxy first assigns the id of models.Proxy{}: id = ctx.maxProxyId() + 1. When the first proxy is added, ctx.maxProxyId() = 0, so the ID of the first proxy is 0 + 1 = 1.

Next, it creates the proxy directory in ZK.

After that, it refreshes the proxy's in-memory data with reinitProxy(ctx, p, c).

Fourth, it sets the following flags:

online = true
proxy.online = true
router.online = true
jodis.online = true

Finally, it creates the jodis directory in ZK.

reinitProxy()

The Dashboard obtains the latest slotMapping, group, and proxy data from ZK and fills them into topom.cache. Based on the slotMapping and group data in the cache, the Proxy can build models.Slot{}, which contains the IP address and port of the backend for each slot. The Proxy establishes a connection to each codis-server and puts the connections into the Router.

Redis requests are processed by a BackendConn taken from a sharedBackendConn. Proxy.Router stores the mapping between all the sharedBackendConnPools and slots in the cluster, and this mapping is used to forward Redis requests to the corresponding slot for processing. reinitProxy() keeps the sharedBackendConnPools and slots in the Router up to date.
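A simplified sketch of the Router idea follows: each of the 1024 slots points at a backend connection for its codis-server, and requests are forwarded by slot ID. The types are stand-ins for Codis's Router, sharedBackendConn, and BackendConn, not the actual source.

```go
package main

import "fmt"

// backendConn is a stand-in for a connection (pool) to one codis-server.
type backendConn struct{ addr string }

func (bc *backendConn) forward(cmd string) string {
	return fmt.Sprintf("forwarded %q to %s", cmd, bc.addr)
}

// slot binds a slot ID to its backend connection.
type slot struct {
	id      int
	backend *backendConn
}

// router maps all 1024 slots to their backends, like Proxy.Router.
type router struct {
	slots [1024]*slot
}

func (r *router) dispatch(slotId int, cmd string) string {
	return r.slots[slotId].backend.forward(cmd)
}

func main() {
	r := &router{}
	bc := &backendConn{addr: "10.0.0.1:6379"}
	for i := range r.slots {
		r.slots[i] = &slot{id: i, backend: bc}
	}
	fmt.Println(r.dispatch(30, "GET user:1001"))
}
```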

To summarize the proxy startup process: first, the configuration file is read to obtain the Config object. Next, a Proxy is created from the Config and its attributes are filled in. The important parts here are populating models.Proxy (the details can be found in ZK), connecting to ZK, and registering the associated paths.

Goroutines are then started to listen for and forward Codis cluster requests on port 11080 and Redis requests on port 19000. Next, the ZK data is flushed into memory, and 1024 models.Slot entries are created in Proxy.Router. During this process, the Router assigns a backendConn to each slot so that Redis requests can be forwarded to the corresponding slot for processing.

Codis's key allocation algorithm first computes the CRC32 of the key to get a 32-bit value, then takes that value modulo 1024 to get the remainder. This remainder is the slot the key belongs to, and each slot maps to a Redis instance.
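The mapping is easy to reproduce; the snippet below computes the slot for a few keys exactly as described: CRC32 of the key, then modulo 1024. (Codis also supports hash tags so that keys sharing a {tag} land in the same slot; that detail is left out here.)

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// slotOf maps a key to one of the 1024 Codis slots.
func slotOf(key string) uint32 {
	return crc32.ChecksumIEEE([]byte(key)) % 1024
}

func main() {
	for _, key := range []string{"user:1001", "user:1002", "order:42"} {
		fmt.Printf("key %-10s -> slot %d\n", key, slotOf(key))
	}
}
```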

How do we ensure that slot migration does not affect the client's business? The client sends a command to the proxy, and the proxy calculates which slot the key corresponds to, for example slot 30. It then looks up the slot{} in the proxy's Router; the slot{} contains backend.bc and migrate.bc. If migrate.bc has a value, the slot is currently being migrated; in that case the proxy handles the key through migrate.bc first, then takes backend.bc.conn and accesses the backend codis-server.
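The sketch below walks through that dispatch decision under the stated assumptions: if the slot's migrate connection is set, the key is pushed to the target first (codis-server provides SLOTSMGRTTAGONE for a single key), and only then is the command served through the normal backend connection. Types and helpers are illustrative, not the Codis source.

```go
package main

import "fmt"

// conn is a stand-in for a connection to one codis-server.
type conn struct{ addr string }

func (c *conn) do(cmd string, args ...string) string {
	return fmt.Sprintf("%s %v -> %s", cmd, args, c.addr)
}

// slot holds the normal backend plus a migrate connection that is
// non-nil only while the slot is being migrated.
type slot struct {
	backend *conn
	migrate *conn
}

// handle processes one key: migrate it first if the slot is moving,
// then serve the request from the backend codis-server.
func handle(s *slot, targetHost, targetPort, key string) string {
	if s.migrate != nil {
		// Push this single key to the target before serving the request.
		fmt.Println(s.migrate.do("SLOTSMGRTTAGONE", targetHost, targetPort, "1000", key))
	}
	return s.backend.do("GET", key)
}

func main() {
	s := &slot{
		backend: &conn{addr: "10.0.0.2:6379"},
		migrate: &conn{addr: "10.0.0.1:6379"},
	}
	fmt.Println(handle(s, "10.0.0.2", "6379", "user:1001"))
}
```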

VII. Shortcomings of Codis and Getui's improvements

Shortcomings of Codis:
● Lack of security considerations: the Codis FE page has no login verification;
● No built-in multi-tenant solution;
● No ready-made cluster scale-down solution.

Getui's improvements:
● Use a squid proxy to simply restrict access to the FE page, and later do secondary development on FE to add login control;
● Small services add a service ID to the key prefix so they can share the same cluster, while large businesses use independent clusters and independent machines;
● Reduce capacity by manually migrating data, emptying nodes, and taking the nodes offline.

As an important basic service for message push, Codis's performance is vital to Getui. After Redis nodes were migrated to Codis, the problems of capacity expansion and operations and maintenance management were effectively solved. In the future, Getui will continue to follow Codis and discuss with everyone how to use it better in production environments.