1 connection between zK leader and follower
(1) After the leader is elected, the status changes and the lead() method is executed. The lead() method actually starts a LearnCnxAcceptor thread to accept requests from other followers/Observers. Using the traditional BIO,
A LearnerCnxAcceptor thread start b LearnerCnxAcceptor thread call Accept and wait for other follower connections to be established C The start D LearnerHandler thread represents requests from the followers to send and receive. The while loop blocks requests from the followers, including registration. Ack, ping, etc. E After reading the request, the request data is parsed and processed through jute's deserialization and custom protocol rules.Copy the code
(2) After the follower identity is established, the status changes and Follwer’s followLeader method is executed to do the following:
A findLeader to find which machine the leader is b connectToLeader to initiate a connection to the leader. At this time, if the connection is established by the leader, LearnerHandler C will be generated to initiate registration with the leader, mainly containing follower information. D While loop, blocking read and process the request sent by the leader, proposal commit and other operations.Copy the code
The leader and follower mainly use Jute for serialization/deserialization. Data starts and ends with words, such as type XXXXX type, which represents the content of type.
2 The connection between the client and server is established
(1) The client starts, which is actually a new Zookeeper instance. In the constructor, a ClientCnxn component is initialized to communicate with the server connection. ClientCnx starts with a while loop in the run method that repeats ping and packets
If a is not connected, the registerAndConnect of ClientCnxnSocketNIO creates a connection. After the connection is established, the interval between the last ping is determined. Sending heartbeat through Sendping and server c Based on clientCnxnSocket encapsulated in the underlying layer, read and write events are traversed through the selectKey of NIO to process read and write events.Copy the code
(2) The server is the NIOServerCnxnFactory started when quorumpeer is started, using the general NIO. The number of threads used is calculated by configure. In fact, it is the worker thread with 2* CPU cores and a fixed ACCEPT thread to establish connection. Since there are not necessarily many clients, one Accept will suffice. Next, Accept’s Run accepts the Accept event and registers the read event, using SELECT to handle specific read and write events.
3 the session management
(1) Session creation process:
After a ZooKeeper client is created, it will ping to the server periodically and request into its own queue. Then the NIO component consumption queue sends the request to the server through the connection mentioned above. B Server receives the request through the NIOServerCnxnFactory network component started. Unpack transformation. C If there is no session on the first request, processConnectRequest is entered and a unique sessionID is created using the sessionTraker component. D Then save the session to the map with the sessionID as the key, divide buckets by touch, and manage the seesion declaration period. Each subsequent request will be touched to get a new expired bucket. E This session requests submit to the processing chain. F processes the chain one layer at a time and writes the result back to the client. If you create a session, you don't need to do much to process the chain. The chain will be used for the subsequent proposal and commit.Copy the code
(2) SessionID: obtain unique ID through machine id (myID configured) + timestamp displacement. Bucket division strategy: Context Context Context = Context Context = Context context = Context context = Context = Context = Context = Context = Context The interval is ExpirationInterval (tickTime configured by default). Session status rotation The main purpose of a session is to determine whether temporary nodes are still retained.
A Connected/Reconnected->Suspended indicates that the heartbeat times out and the Session may fail (temporary nodes may disappear). B Suspended->Lost indicates that the Server has been connected to the Server and the Server has confirmed the Session timeout, or the client cannot connect to the Server for a long time, so the Session cannot be used completely (the temporary node has disappeared); C Suspended->Reconnected reconnects to the Server and the Session can continue to use (temporary nodes still exist); D Suspended->Lost->Reconnected Reconnects to the Server, but the Session has been re-created (temporary nodes need to be re-created)Copy the code
(4) Session touch activation
When the client sends an activation request:
A Session activation is triggered when the client sends requests, including read/write requests, to the server. B If the client does not communicate with the server within sessionTimeout/3, it initiates a ping request. After receiving the request, the server triggers session activation.Copy the code
(5) Session detection and clearing sessionTracker’s run method periodically detection,
A sets the session status bit to close and clear expired buckets. B submits a local request to the processing chain for closeSession and performs a 2PC deletion like a normal request to delete temporary memory nodes. Session stores all temporary nodes created by itself, making it easy to delete them directly. It will also trigger Watcher. C to remove the session, the session object to remove sessionTracker removeSession (sessionId); D From the NIOServerCnxFactory, close the connection corresponding to the session according to the standard NIO closing process.Copy the code
4 2PC submission process analysis
Start from create on the client and send it to the leader machine of zKServer. The leader has his own proposal processing chain, sends the proposal to all followers, and collects the returned ACKS with an ACK set of the proposal. After half of the acks are returned, the commit is triggered
A client create, the request enters the client sending queue outgoingQueue, and the SendThread thread of ClientCNX sends the request to the NIOServerCnxnFactory component of zKServer and submits it to the processing chain. If the request reaches the follower, the follower completes the forwarding. 2.1 b reach ProposalRequestProcessor processing chain, open, into the proposal method, into the outstandingProposals queue, traverse all leaner components (learner here can be viewed as followers), Send the proposal to each follower c After receiving the proposal request, the follower processes the data packet and submits it to the Synprocessor thread for flushing the memory database and disk. Flush is when the packet queue is empty or 1000 transaction data is accumulated. The ACK is then returned via SendAckRequestProcessor, whose names are easy to understand. D After receiving the ACK of the proposal, the leader collects statistics while collecting the ack. If more than half of the ack is received, the leader should trycommit, commit himself first, and then asynchronously send the COMMIT to all the followers. After the leader commits, it can be used externally without waiting for the ACK of the commit. The leader commits by FinalRequestProcessor, writes the zkDatabase to the memory database, and feeds back the result to the client. E The follower updates the data after receiving the commit. The logic is the same as that of the leader's commit.Copy the code
A few more details:
(1) leader receives the written request, the first processor, PreRequestProcessor generated in the transaction id, zxid how to generate? Incrementing hzxID.incrementandGet ();
(2) Storage of the full data snapshot: SyncRequestProcessor will determine whether to drop the overall snapshot (according to the size of the configuration) when the continuous read transaction requests memory storage. If necessary, it will start the disk write thread and write the memory tree to the disk through serialization.
Or if a qurumPeerMain is set up with a DatadirCleanupManager, PurgeTask can be used to clean up files because snapshots are set to their full length.
(4) Processing chain of leader and follower: