A target: container operation; Two sites and three centers; Four layers of service discovery; Five shared Pod resources; Six common CNI plug-ins; Seven layers of load balancing; Eight dimensions of isolation; Nine network model principles; Ten classes of IP addresses; Hundred-level product lines; Thousand-level physical machines; Ten-thousand-level containers. And at the hundred-million level, K8s has one of those too: 100 million daily services.

A target: container operation



Kubernetes (K8S) is an open source platform for automated container operations. These container operations include deployment, scheduling, and scaling across node clusters.

Specific functions:

  • Automate container deployment and replication.
  • Elastically scale containers up and down in real time.
  • Containers are organized into groups and load balancing is provided between containers.

Scheduling: deciding which machine a container runs on.

Composition:

  • Kubectl: A client-side command line tool that serves as an entry point for the entire system.
  • Kube-apiserver: Provides interfaces in the form of REST API services and serves as the control entry for the entire system.
  • Kube-controller-manager: Performs the background tasks of the entire system, such as tracking node status, Pod counts, and the association between Pods and Services.
  • Kube-scheduler: Responsible for node resource management; receives Pod-creation tasks from kube-apiserver and assigns them to a node.
  • Etcd: Responsible for service discovery and configuration sharing between nodes.
  • Kube-proxy: Runs on each compute node and is responsible for the Pod network proxy. It periodically obtains Service information from etcd and applies the corresponding policies.
  • Kubelet: Runs on each compute node as an agent; it receives the Pod tasks assigned to its node and manages the containers, periodically obtaining container state and reporting it back to kube-apiserver.
  • DNS: An optional DNS service that creates a DNS record for each Service object so that all Pods can access the Service by name.

The following is the architecture topology of K8s:

Two sites and three centers



Two sites and three centers means the local production center, the local disaster recovery (DR) center, and the remote DR center.

Data consistency is an important problem that a two-site, three-center system must solve. K8s uses the etcd component as a highly available, strongly consistent repository for configuration sharing and service discovery.

Etcd is a project inspired by ZooKeeper and Doozer. In addition to having all of their features, it has the following four properties:

  • Simple: The HTTP + JSON-based API makes it easy to use with curl.
  • Secure: SSL client authentication is optional.
  • Fast: Each instance supports 1,000 writes per second.
  • Trusted: Fully distributed, using the Raft consensus algorithm.
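The "simple" point above is concrete: etcd's v2 keys API speaks plain HTTP + JSON, so any client that can parse JSON can consume it. The sketch below parses a sample response body whose shape follows the etcd v2 API; the key and value are illustrative, not from the article.

```python
import json

# A sample response body, as returned by e.g.
# GET http://127.0.0.1:2379/v2/keys/services/redis
# (endpoint and key are hypothetical; the JSON shape follows etcd's v2 keys API).
sample_response = """
{
  "action": "get",
  "node": {
    "key": "/services/redis",
    "value": "10.0.0.11:6379",
    "modifiedIndex": 7,
    "createdIndex": 7
  }
}
"""

doc = json.loads(sample_response)
endpoint = doc["node"]["value"]  # the service address stored under the key
print(endpoint)  # 10.0.0.11:6379
```

The same request is a one-liner with curl, which is exactly why the article calls etcd "simple".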

Four layers of service discovery



First, a diagram of the seven-layer network protocol model:

K8s provides two ways to do service discovery:

Environment variables: When a Pod is created, kubelet injects environment variables for all Services in the cluster into the Pod. Note that for a Service's environment variables to be injected into a Pod, the Service must be created before the Pod. This restriction makes this approach to service discovery almost unusable.

For example, if the Service name is redis-master and its ClusterIP:Port is 10.0.0.11:6379, the corresponding environment variables are:
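The code listing that originally followed is missing here; based on the Docker-link-style environment variables Kubernetes documents for Services, the injected variables would look like this:

```
REDIS_MASTER_SERVICE_HOST=10.0.0.11
REDIS_MASTER_SERVICE_PORT=6379
REDIS_MASTER_PORT=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
REDIS_MASTER_PORT_6379_TCP_PORT=6379
REDIS_MASTER_PORT_6379_TCP_ADDR=10.0.0.11
```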

DNS: KubeDNS can be created easily in cluster add-on mode to discover services in the cluster.

Of the two methods above, one is based on TCP and, as you know, DNS is based on UDP; both are built on top of layer-4 (transport) protocols.
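The DNS method works because KubeDNS registers a predictable name for every Service. A small sketch of the naming convention (`cluster.local` is the default cluster domain; a real cluster may use another):

```python
def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """Build the conventional in-cluster DNS name that KubeDNS registers
    for a Service: <service>.<namespace>.svc.<cluster-domain>."""
    return f"{service}.{namespace}.svc.{cluster_domain}"

# Any Pod in the cluster can reach redis-master by resolving this name.
name = service_dns_name("redis-master")
print(name)  # redis-master.default.svc.cluster.local
```

Unlike the environment-variable method, this works regardless of whether the Service was created before or after the Pod.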

Five shared Pod resources



A Pod is the most basic operating unit of K8s. It contains one or more closely related containers and can be regarded as a containerized environment, the "logical host" of the application layer. Multiple container applications in a Pod are usually tightly coupled, and Pods are created, started, or destroyed on Nodes. Each Pod runs a special container called Pause; the other containers are business containers that share the Pause container's network stack and mounted Volumes, which makes communication and data exchange between them more efficient. We can take advantage of this in design by putting a set of closely related service processes into the same Pod.

Containers in the same Pod can communicate with each other simply via localhost.

The application container in a Pod shares five resources:

  • PID namespace: Different applications in the Pod can see each other's process IDs.
  • Network namespace: Multiple containers in the Pod share the same IP address and port range.
  • IPC namespace: Multiple containers in the Pod can communicate using System V IPC or POSIX message queues.
  • UTS namespace: Multiple containers in the Pod share one host name.
  • Volumes (shared storage volumes): Each container in the Pod can access Volumes defined at the Pod level.
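The shared network namespace is what makes localhost communication between a Pod's containers work. A minimal sketch (not a real Pod): the two "containers" are modeled as two threads that share 127.0.0.1, the way sibling containers share the Pause container's network stack.

```python
import socket
import threading

def server(sock: socket.socket) -> None:
    """Plays the role of one container listening on a port inside the Pod."""
    conn, _ = sock.accept()
    with conn:
        conn.sendall(b"pong")

listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # ephemeral port, like a container port
listener.listen(1)
port = listener.getsockname()[1]

t = threading.Thread(target=server, args=(listener,))
t.start()

# The "second container" reaches the first purely via localhost.
with socket.create_connection(("127.0.0.1", port)) as client:
    reply = client.recv(4)
t.join()
listener.close()
print(reply)  # b'pong'
```

This mirrors the article's comparison: containers in one Pod behave like different processes inside the same VM.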

The lifecycle of a Pod is managed by the Replication Controller: the Pod is defined in a template and then allocated to a Node to run on, and the Pod ends when the containers inside it finish running.

Kubernetes designed a unique set of network configurations for Pods, including assigning each Pod an IP address and using the Pod name as the host name for communication during the Pod's lifetime.

Six common CNI plug-ins



Container Network Interface (CNI) is a set of standards and libraries for Linux container network configuration; users develop their own container network plug-ins according to these standards and libraries. CNI focuses only on container network connections and on releasing resources when containers are destroyed, so it can support a large number of different network modes and is easy to implement.

Here is a picture of the six common CNI plug-ins:

Seven layers of load balancing



Load balancing has to start with communication between servers.

An Internet Data Center (IDC), also called a Data Center or equipment room, is used to house servers. The IDC network is a communication bridge between servers.

There are a lot of network devices in the picture above. What are they for?

Routers, switches, and MGW/NAT are all network devices; they take on different roles depending on performance and on their intranet or extranet position.

  • Intranet access switch: Also known as top of rack (TOR), a device that connects servers to the network. Each Intranet access switch is connected to 40 to 48 servers. A network segment with a mask of /24 is used as the internal network segment of the servers.
  • Intranet core switch: Forwards traffic of Intranet access switches in the IDC and cross-IDC traffic.
  • MGW/NAT: MGW (LVS) is used for load balancing, and NAT is used for translating addresses when Intranet devices access the Internet.
  • Extranet core router: Connects the Meituan platform to the extranet through static interconnection with carriers or through BGP.

Let's look at load balancing layer by layer:

  • Layer 2 load balancing: Load balancing based on MAC addresses.
  • Layer 3 load balancing: Load balancing based on IP addresses.
  • Layer 4 load balancing: Load balancing based on IP + port.
  • Layer 7 load balancing: Load balancing based on application-layer information such as URLs.

Here’s a diagram of the difference between layer 4 and layer 7 load balancing:
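The difference can also be sketched in code: a layer-4 balancer decides from addresses and ports alone, while a layer-7 balancer must parse the request and can route by its URL. The backend addresses and routing table below are made up for illustration.

```python
def l4_route(client_ip: str, vip: str, port: int, backends: list[str]) -> str:
    """Layer 4: pick a backend from (IP, port) only -- no application data
    is inspected, so every URL behind this VIP:port goes to the same pool."""
    return backends[hash((client_ip, vip, port)) % len(backends)]

def l7_route(url_path: str) -> str:
    """Layer 7: the HTTP request has been parsed, so routing can depend
    on the URL path (hypothetical backends)."""
    routes = {"/api": "10.0.1.10:8080", "/static": "10.0.2.10:8080"}
    for prefix, backend in routes.items():
        if url_path.startswith(prefix):
            return backend
    return "10.0.0.10:8080"  # default backend

print(l7_route("/api/users"))     # 10.0.1.10:8080
print(l7_route("/static/a.css"))  # 10.0.2.10:8080
```

This URL-based dispatch is exactly the capability that Ingress, discussed below, builds on.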

The four-layer service discovery described above mainly covers K8s' native kube-proxy mode. K8s mainly exposes services through NodePort mode: it binds a port on the minion host and then performs Pod request forwarding and load balancing. This method has the following disadvantages:

  • There may be many Services. If each one binds a node host port, the host must open a large number of peripheral ports for service invocation, which makes management chaotic.
  • Firewall rules, which many companies require, cannot be applied.

Ideally, an external load balancer would bind to a fixed port, such as 80, and forward to a backing Service IP based on the domain name or Service name. Nginx solves this requirement well, but the problem is how to modify and reload the Nginx configuration whenever a Service is added. Kubernetes' answer is Ingress, a layer-7-based solution.

Eight dimensions of isolation



On the K8s cluster scheduling side, corresponding scheduling policies need to be implemented for isolation from top to bottom and from coarse-grained to fine-grained.

Nine network model principles



The K8s network model must comply with four basic principles, three network requirement principles, one architecture principle, and one IP principle.

Each Pod has a separate IP address, and assuming that all pods are in a directly connected, flat network space, they can be accessed via the Pod’s IP regardless of whether they are running on the same Node.

In K8s, the Pod's IP is the minimum-granularity IP. All containers in the same Pod share one network stack; this is called the IP-per-Pod model.

  • The Pod's actual IP is the one assigned by docker0.
  • The IP address and port seen inside the Pod are the same as those seen outside.
  • Different containers within the same Pod share one network and can access each other's ports via localhost, like different processes within the same VM.

From the perspectives of port allocation, domain name resolution, service discovery, load balancing, and application configuration, the IP-per-Pod model can be regarded as an independent VM or physical machine.

  • All containers can communicate with other containers without using NAT.
  • All nodes can communicate with all containers without NAT, and vice versa.
  • The address a container sees for itself is the same address others see for it.

It should conform to the following architecture:

In the architecture above, the concept of IP extends from outside the cluster to inside it.

Ten classes of IP addresses



As you all know, IP addresses are divided into classes A through E, and there are also five types of special-purpose IP addresses.

The first class

Class A: 1.0.0.0 to 126.255.255.255. The default subnet mask is /8, i.e. 255.0.0.0.
Class B: 128.0.0.0 to 191.255.255.255. The default subnet mask is /16, i.e. 255.255.0.0.
Class C: 192.0.0.0 to 223.255.255.255. The default subnet mask is /24, i.e. 255.255.255.0.
Class D: 224.0.0.0 to 239.255.255.255, generally used for multicast.
Class E: 240.0.0.0 to 255.255.255.255 (where 255.255.255.255 is the broadcast address of the whole network). Class E addresses are generally reserved for research purposes.
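The A-E classification above depends only on the first octet, so it reduces to a few range checks. A quick sketch:

```python
def ip_class(addr: str) -> str:
    """Classify an IPv4 address into class A-E by its first octet.
    (127.x.x.x, the loopback range, falls outside the A-E ranges above.)"""
    first = int(addr.split(".")[0])
    if 1 <= first <= 126:
        return "A"
    if 128 <= first <= 191:
        return "B"
    if 192 <= first <= 223:
        return "C"
    if 224 <= first <= 239:
        return "D"  # multicast
    if 240 <= first <= 255:
        return "E"  # reserved for research
    raise ValueError(f"unclassified first octet: {first}")

print(ip_class("10.0.0.11"))    # A
print(ip_class("192.168.1.1"))  # C
```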

The second class

Strictly speaking, 0.0.0.0 is not a real IP address. It represents the collection of all hosts and destination networks that are "unclear", meaning there is no specific entry in the local routing table indicating how to reach them; it serves as the default route. 127.0.0.1 is the local loopback address.

The third class

224.0.0.1 is a multicast address. If your host has IRDP enabled (Internet Router Discovery Protocol, which uses multicast), your routing table should contain such a route.

The fourth class

169.254.x.x: If the DHCP server is faulty or its response time is too long, the system assigns the host an address in this range, which means the network cannot work properly.

The fifth class

10.x.x.x, 172.16.x.x to 172.31.x.x, and 192.168.x.x are private addresses, widely used inside organizations. These ranges are reserved to avoid address confusion when connecting to the public network.
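The special-purpose ranges above can be checked directly with Python's standard ipaddress module, which already knows the private, link-local, and multicast ranges:

```python
import ipaddress

# Each property corresponds to one of the special-purpose ranges above.
print(ipaddress.ip_address("10.1.2.3").is_private)        # True  (private)
print(ipaddress.ip_address("172.20.0.5").is_private)      # True  (172.16-172.31)
print(ipaddress.ip_address("169.254.0.7").is_link_local)  # True  (DHCP fallback)
print(ipaddress.ip_address("224.0.0.1").is_multicast)     # True  (multicast)
print(ipaddress.ip_address("8.8.8.8").is_global)          # True  (public)
```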

Article source: stackpush. Article link: blog.csdn.net/huakai_sun/article/details/82378856