【K8S Series】K8S Learning 27-7: The Principles Behind K8S's Own High Availability
Speaking of high availability: back when we ran services directly on hosts (not on K8S), we typically achieved high availability in these ways:
- Deploy servers in active/standby mode: while both nodes are alive, only the active node serves external traffic, and the standby node takes over immediately when the active node fails
- To further improve availability, run multi-site active-active deployments, add more service access nodes, and split traffic among them
- Back up the database as well: regular synchronization plus hot or cold backups
Having shared so much earlier about how the K8S components work, we can now look back and ask: why do we choose K8S in the first place?
Simply put, it is because of the characteristics of K8S itself:
- Built-in load balancing
- Service self-healing
- Managed services can be scaled horizontally
- Upgrades can be done as rolling updates, so the transition is smooth and the whole upgrade process stays seamless
- If an upgrade goes wrong, you can roll back with one command, and the running service is not affected during the rollback
All of these are things that take a lot of manpower to build in a plain host environment. That is why we ultimately choose to deploy services on K8S: it greatly reduces the mental burden on development and operations, as well as the operations cost
So how does K8S itself guarantee high availability?
How can we think about high availability?
From the perspective of the pod
In terms of pod availability, we mentioned earlier that pods can be managed with the higher-level resource Deployment, which creates, updates, and deletes pods for us and lets us upgrade and roll back smoothly
Of course, this assumes stateless pods, which is the default case
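To make this concrete, here is a minimal sketch of creating such a Deployment with client-go, using a rolling-update strategy. The name demo-web, the nginx:1.21 image, and the kubeconfig path are purely illustrative assumptions, not anything from the original article:

```go
package main

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a clientset from a local kubeconfig (path is an assumption).
	cfg, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(cfg)

	replicas := int32(3)
	maxUnavailable := intstr.FromInt(1)
	maxSurge := intstr.FromInt(1)

	deploy := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: "demo-web", Namespace: "default"},
		Spec: appsv1.DeploymentSpec{
			Replicas: &replicas,
			Selector: &metav1.LabelSelector{MatchLabels: map[string]string{"app": "demo-web"}},
			// RollingUpdate keeps old pods serving while new ones come up,
			// which is what makes smooth upgrades and rollbacks possible.
			Strategy: appsv1.DeploymentStrategy{
				Type: appsv1.RollingUpdateDeploymentStrategyType,
				RollingUpdate: &appsv1.RollingUpdateDeployment{
					MaxUnavailable: &maxUnavailable,
					MaxSurge:       &maxSurge,
				},
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: map[string]string{"app": "demo-web"}},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{
						{Name: "web", Image: "nginx:1.21"},
					},
				},
			},
		},
	}

	// Submit the Deployment; the Deployment controller then keeps 3 replicas alive.
	_, err = clientset.AppsV1().Deployments("default").
		Create(context.TODO(), deploy, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
}
```

With maxUnavailable and maxSurge both set to 1, an upgrade replaces pods one at a time, so some replicas keep serving throughout the rollout.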
If our service is stateful and runs in a pod, we can still use a Deployment, but only with a single replica, otherwise the replicas would conflict with each other
If the pods need data volumes, you can also use the StatefulSet resource to manage them
However, with a single replica, if the pod goes down, the service is interrupted for a while until the pod is restarted and can serve again
Stateful services: a high-availability approach when you cannot scale horizontally
Stateful services that cannot scale horizontally can also be handled with the active/standby approach mentioned at the beginning
Similarly, we can use a leader election mechanism to elect one instance of the stateful service to handle external requests
When that instance fails, the same mechanism elects a new valid instance from the remaining ones to keep handling incoming requests
The specific algorithm and implementation in K8S will be shared in a later article
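Still, to get a feel for the pattern itself, here is a minimal sketch of application-level leader election using the leaderelection package from client-go, which is one common way to implement this on K8S. The lock name demo-stateful-lock, the default namespace, and the POD_NAME environment variable (injected via the Downward API) are assumptions for illustration:

```go
package main

import (
	"context"
	"log"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	// Every replica of the stateful service runs this same code in-cluster.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Each pod uses its own name as its identity (POD_NAME is an assumed env var).
	id := os.Getenv("POD_NAME")

	// A Lease object acts as the shared lock: whoever holds it is the leader.
	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "demo-stateful-lock", Namespace: "default"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: id},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:            lock,
		LeaseDuration:   15 * time.Second,
		RenewDeadline:   10 * time.Second,
		RetryPeriod:     2 * time.Second,
		ReleaseOnCancel: true,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				log.Printf("%s became leader, start serving requests", id)
				// The real work of the stateful service would go here.
				<-ctx.Done()
			},
			OnStoppedLeading: func() {
				log.Printf("%s lost leadership, stop serving", id)
			},
		},
	})
}
```

Only the pod currently holding the Lease serves requests; when it crashes or stops renewing the lock, another replica acquires it and takes over.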
From the perspective of etcd
We already used etcd back in the host environment
The key is the service path, a string separated by "/", and the value is the service's IP address and port
etcd itself is designed as a distributed system: multiple etcd instances can run together as a cluster, so it is naturally easy to make highly available
Generally we deploy a cluster of 3, 5, or 7 instances (an odd number, so that a majority quorum can still be reached when nodes fail); for the reasoning behind this you can refer to the earlier share on Redis cluster deployment
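As a small illustration of that key/value layout, here is a hedged sketch using the Go etcd client (go.etcd.io/etcd/client/v3); the endpoints, the /services/user-service path, and the address are made-up examples:

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Connect to the etcd cluster (endpoints are placeholders for a 3-node deployment).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"10.0.0.1:2379", "10.0.0.2:2379", "10.0.0.3:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	// Key is a "/"-separated service path, value is the instance's IP and port.
	if _, err := cli.Put(ctx, "/services/user-service/instance-1", "192.168.1.10:8080"); err != nil {
		panic(err)
	}

	// List every registered instance under the service prefix.
	resp, err := cli.Get(ctx, "/services/user-service/", clientv3.WithPrefix())
	if err != nil {
		panic(err)
	}
	for _, kv := range resp.Kvs {
		fmt.Printf("%s -> %s\n", kv.Key, kv.Value)
	}
}
```

Because every etcd member replicates the same data, the read can still be answered when one of the three nodes is down, as long as a majority remains alive.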
Take a look at this schematic:
Multi master, multi worker schematic diagram
From the perspective of the ApiServer
From ApiServer’s point of view, it’s even simpler
The component itself is stateless and does not cache data; it talks directly to etcd, which holds the actual state. So components on the worker nodes can send their requests to any ApiServer instance on the master nodes
Because the etcd behind the ApiServers is distributed, the data is replicated across its instances
In multi-master mode, workers talk to the masters through a load balancer, which spreads the traffic across the nodes and ensures that worker requests are always sent to a healthy ApiServer
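As a rough sketch of what that looks like from the client side: the client is configured with the load balancer's address rather than any single master, and everything else stays the same. The LB address and kubeconfig path below are assumptions:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Point the client at the load balancer, not at any single master:
	// the LB forwards each request to a healthy kube-apiserver instance.
	// The LB address and kubeconfig path are purely illustrative.
	cfg, err := clientcmd.BuildConfigFromFlags(
		"https://k8s-api.example.com:6443", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Any healthy ApiServer behind the LB can answer this request.
	nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("cluster reports", len(nodes.Items), "nodes")
}
```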
From the perspective of the scheduler and controller manager
The scheduler and controller manager are not as simple as the ApiServer, because running multiple instances of them can lead to conflicting operations on the same resources
Most of what they do is watch: multiple controllers all watch the ApiServer for resource changes
Suppose, for example, that three ReplicaSet controllers are all watching the ApiServer, and the desired replica count of a ReplicaSet is increased by two
All three controllers would see the change and, each trying to satisfy the desired state, would act on it, so in the end six new pods would be created in the environment
Of course, this is not what we expect. It would be a waste of resources
Therefore, although both the controller manager and the scheduler actively watch the cluster state, they avoid this kind of harmful competition by reusing the active/standby idea described above
They are started with the --leader-elect option (enabled by default), which elects a leader among the instances; whoever is the leader performs the actual work after watching for changes, while the others simply wait for the leader to fail so one of them can take over
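On recent clusters, the winner of that election typically records itself as the holder of a Lease object in the kube-system namespace (older versions used Endpoints or ConfigMap locks), so you can peek at who the current leader is. A minimal sketch, assuming a local kubeconfig:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// kubeconfig path is an assumption for illustration.
	cfg, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// The elected controller-manager instance records itself in this Lease.
	lease, err := client.CoordinationV1().Leases("kube-system").
		Get(context.TODO(), "kube-controller-manager", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	if lease.Spec.HolderIdentity != nil {
		fmt.Println("current kube-controller-manager leader:", *lease.Spec.HolderIdentity)
	}
}
```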
Well, that’s all for today
That's what I learned today; if anything here is off, please point it out
Welcome to like, follow and favorites
Friends, your support and encouragement are what keep me sharing and pushing the quality higher
All right, that’s it for this time
Technology is open, and our mindset should be even more open. Embrace change, face the sun, and keep moving forward.
I am Nezha, welcome to like, see you next time ~