Author | Zhao Mingshan (Li Heng)

Preface

OpenKruise is an open source suite from Alibaba Cloud for automated management of cloud native applications, and is currently a Sandbox project hosted by the Cloud Native Computing Foundation (CNCF). It grew out of Alibaba's years of experience with containerization and cloud native technology. It is a set of standard Kubernetes extension components used at large scale in Alibaba's internal production environment, and it embodies technical concepts and best practices that stay close to upstream community standards while adapting to Internet-scale scenarios. OpenKruise released v0.9.0 (ChangeLog) on May 20, 2021. In the previous article we introduced Pod restart, deletion protection, and other important new features. Today we introduce another core feature: building on the previous version, SidecarSet adds support specifically for Service Mesh scenarios.

Background: how to upgrade a Mesh container independently

SidecarSet is the Kruise workload for managing Sidecar containers independently. With SidecarSet, Sidecar containers can be injected automatically and upgraded independently. For details, see the official OpenKruise website.

By default, an independent Sidecar upgrade stops the old container first and then creates the new one. This works well for Sidecar containers that do not affect the availability of the Pod's service, such as log collection agents. It is problematic for many proxy or runtime Sidecar containers, such as Istio Envoy. Envoy proxies all traffic in the Pod, so restarting it directly for an upgrade affects the availability of the Pod's service. You therefore have to take the application's own release process and capacity into account, and cannot release the Sidecar completely independently of the application.

Tens of thousands of Pods in Alibaba Group communicate with each other through Service Mesh. Because upgrading the Mesh container makes the service Pod unavailable, Mesh container upgrades greatly hinder the iteration of Service Mesh. For this scenario, we worked with the Service Mesh team within the group to implement hot upgrade for Mesh containers. This article focuses on the role SidecarSet plays in implementing that hot upgrade capability.

SidecarSet enables lossless hot upgrade of Mesh containers

A Mesh container cannot be upgraded in place like a log collection container. The reasons are as follows:

  • The Mesh container must serve traffic continuously, and the independent upgrade mode makes the Mesh service unavailable for a period of time.
  • Well-known Mesh services in the community, such as Envoy and Mosn, provide smooth upgrade by default, but these upgrade mechanisms do not integrate well with cloud native environments, and Kubernetes lacks an upgrade solution for such Sidecar containers.

The OpenKruise SidecarSet provides a Sidecar hot upgrade mechanism for Mesh containers, enabling lossless hot upgrade of Mesh containers in a cloud native way.

    apiVersion: apps.kruise.io/v1alpha1
    kind: SidecarSet
    metadata:
      name: hotupgrade-sidecarset
    spec:
      selector:
        matchLabels:
          app: hotupgrade
      containers:
      - name: sidecar
        image: openkruise/hotupgrade-sample:sidecarv1
        imagePullPolicy: Always
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - /migrate.sh
        upgradeStrategy:
          upgradeType: HotUpgrade
          hotUpgradeEmptyImage: openkruise/hotupgrade-sample:empty

  • **upgradeType**: HotUpgrade indicates that this sidecar container is upgraded by hot upgrade.
  • **hotUpgradeEmptyImage**: When hot upgrading a Sidecar container, the business needs to provide an empty container that is used for container switching during the hot upgrade. The empty container has the same configuration as the Sidecar container (except for the image address), such as command, lifecycle, and probe.

The SidecarSet hot upgrade mechanism has two parts: injecting the hot upgrade Sidecar containers, and smoothly upgrading the Mesh container.

Inject a hot upgrade Sidecar container

For Sidecar containers of the hot upgrade type, the SidecarSet webhook injects two containers when the Pod is created:

  • {sidecar.name}-1: for example envoy-1. This container is the Sidecar container that actually does the work, e.g. envoy:1.16.0

  • {sidecar.name}-2: for example envoy-2. This container runs the hotUpgradeEmptyImage provided by the business, e.g. empty:1.0

The Empty container does no actual work while the Mesh container is running.

Smooth upgrade of the Mesh container

The hot upgrade process consists of the following three steps:

  1. Upgrade: Replace the Empty container with the latest version of the Sidecar container, for example: envoy-2.image = envoy:1.17.0

  2. Migration: Execute the PostStartHook script of the Sidecar container to smoothly upgrade the Mesh service

  3. Reset: After the Mesh service has been smoothly upgraded, replace the old Sidecar container with the Empty container, for example: envoy-1.image = empty:1.0

You only need to perform the preceding three steps to complete the hot upgrade process. If you perform multiple hot upgrades for the Pod, repeat the preceding three steps.
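The three steps can be sketched as a small state transition in Go. This is only an illustrative model, not Kruise code: the `sidecarPod` type, the `hotUpgrade` function, and the `migrate` callback (standing in for the PostStartHook script) are hypothetical names, and `envoy:1.18.0` is just an example image for a second round.

```go
package main

import "fmt"

// sidecarPod is a hypothetical, simplified model of a Pod with the two
// injected sidecar slots ({name}-1 and {name}-2). It is not a Kruise type.
type sidecarPod struct {
	images  [2]string // image set on slot 0 ({name}-1) and slot 1 ({name}-2)
	working int       // index of the slot that currently serves traffic
}

// hotUpgrade applies the three steps to a copy of p and returns the result.
// migrate stands in for the PostStartHook script of the new sidecar.
func hotUpgrade(p sidecarPod, newImage, emptyImage string, migrate func(from, to int) error) (sidecarPod, error) {
	idle := 1 - p.working

	// 1. Upgrade: the slot holding the empty image gets the new sidecar image.
	p.images[idle] = newImage

	// 2. Migration: the new sidecar takes over from the old one.
	if err := migrate(p.working, idle); err != nil {
		// The returned state still has the old slot marked as working.
		return p, fmt.Errorf("migration failed: %w", err)
	}

	// 3. Reset: the old slot goes back to the empty image.
	p.images[p.working] = emptyImage
	p.working = idle
	return p, nil
}

func main() {
	noop := func(from, to int) error { return nil }
	p := sidecarPod{images: [2]string{"envoy:1.16.0", "empty:1.0"}, working: 0}

	// First hot upgrade: slot 1 becomes the working sidecar.
	p, _ = hotUpgrade(p, "envoy:1.17.0", "empty:1.0", noop)
	fmt.Println(p.images, "working slot:", p.working)

	// Second hot upgrade: the same three steps, with the roles swapped back.
	p, _ = hotUpgrade(p, "envoy:1.18.0", "empty:1.0", noop)
	fmt.Println(p.images, "working slot:", p.working)
}
```

Running the steps twice shows why they can simply be repeated: after each round the working sidecar and the Empty container have swapped slots.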

Migration core logic

The SidecarSet hot upgrade mechanism not only switches between the Mesh containers but also provides a coordination mechanism between the old and new versions (PostStartHook). This is only the first step, however. The Mesh container must also provide a PostStartHook script that performs the smooth upgrade of the Mesh service itself (the Migration step described above), for example Envoy hot restart or Mosn lossless restart.

Mesh containers generally serve external traffic by listening on fixed ports. The migration process of a Mesh container can be summarized as follows: pass ListenFD over a UDS (Unix Domain Socket), stop Accept, and start draining. Mesh containers that do not support hot restart can be modified to follow this procedure.
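The "stop Accept and start draining" part can be illustrated with Go's standard `net/http` server. This is a minimal sketch, not taken from Envoy, Mosn, or the Kruise demo: it assumes the old sidecar is an HTTP server, and `drainDemo` is a hypothetical helper. `Server.Shutdown` closes the listener and then waits for in-flight requests to finish. In a real hot upgrade, the listening socket would already have been duplicated and handed to the new container, so closing the old listener would not close the socket itself.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// drainDemo starts an HTTP server, puts one request in flight, calls
// Shutdown, and only then lets the request finish. It returns the body the
// in-flight request received and the error returned by Shutdown.
func drainDemo() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}

	started := make(chan struct{}, 1) // signalled when the handler is running
	release := make(chan struct{})    // closed to let the handler finish
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		started <- struct{}{}
		<-release
		io.WriteString(w, "finished while draining")
	})

	srv := &http.Server{Handler: mux}
	shutdownCalled := make(chan struct{})
	srv.RegisterOnShutdown(func() { close(shutdownCalled) }) // fires when Shutdown is called
	go srv.Serve(ln)

	// One client request that stays in flight until release is closed.
	bodyCh := make(chan string, 1)
	go func() {
		resp, err := http.Get("http://" + ln.Addr().String() + "/")
		if err != nil {
			bodyCh <- "request failed: " + err.Error()
			return
		}
		defer resp.Body.Close()
		b, _ := io.ReadAll(resp.Body)
		bodyCh <- string(b)
	}()
	select {
	case <-started:
	case b := <-bodyCh:
		return b, fmt.Errorf("request ended before the handler ran")
	}

	// Stop accepting and drain: Shutdown closes the listener and waits for
	// active connections to become idle, bounded by the context timeout.
	errCh := make(chan error, 1)
	go func() {
		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		defer cancel()
		errCh <- srv.Shutdown(ctx)
	}()
	<-shutdownCalled

	close(release) // the in-flight request may now complete
	return <-bodyCh, <-errCh
}

func main() {
	body, err := drainDemo()
	fmt.Printf("in-flight response: %q, shutdown error: %v\n", body, err)
}
```

The request that was already being handled still receives its full response, while no new connections are accepted by the old server.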

Hot upgrade Migration Demo

Different Mesh containers provide different external services and have different internal implementations, so the concrete Migration also differs. The logic above only summarizes some key points, and we hope it helps those who need it. We also provide a hot upgrade Migration Demo on GitHub for reference. Some of its key code is described below.

1. Negotiation mechanism

When the Mesh container starts, it first needs to determine whether this is a first start or the smooth migration process of a hot upgrade. To reduce the communication cost for the Mesh container, **Kruise injects two environment variables, SIDECARSET_VERSION and SIDECARSET_VERSION_ALT, into both Sidecar containers.** By comparing the values of these two variables, the container determines whether a hot upgrade is in progress and whether the current sidecar container is the new or the old version.

    // isHotUpgradeProcess returns two values:
    // 1. (bool) whether a hot upgrade is in progress
    // 2. (bool) when a hot upgrade is in progress, whether the current sidecar
    //    is the newer one (true) or the older one (false)
    // Requires the "os" and "strconv" packages.
    func isHotUpgradeProcess() (bool, bool) {
        // Version of the current sidecar
        version := os.Getenv("SIDECARSET_VERSION")
        // Version of the peer sidecar
        versionAlt := os.Getenv("SIDECARSET_VERSION_ALT")
        // If the version of the peer sidecar container is "0",
        // this is not a hot upgrade process
        if versionAlt == "0" {
            return false, false
        }
        // A hot upgrade is in progress
        versionInt, _ := strconv.Atoi(version)
        versionAltInt, _ := strconv.Atoi(versionAlt)
        // The version is a monotonically increasing int,
        // so the new sidecar has the larger version
        return true, versionInt > versionAltInt
    }
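How a Mesh container might act on this negotiation result can be sketched as follows. This is an assumption about one possible design, not code from the Kruise demo: the `startupAction` type, its three values, and `chooseAction` are hypothetical names. The function takes the two version strings as parameters instead of reading the environment, which makes the decision easy to test, and it reports malformed versions instead of ignoring them.

```go
package main

import (
	"fmt"
	"strconv"
)

// startupAction is a hypothetical enum for what a Mesh container does next.
type startupAction int

const (
	listenFresh     startupAction = iota // first start: bind the port normally
	takeOverFromOld                      // newer sidecar in a hot upgrade: receive ListenFD
	handOverToNew                        // older sidecar in a hot upgrade: send ListenFD, then drain
)

// chooseAction maps the values of SIDECARSET_VERSION and
// SIDECARSET_VERSION_ALT to a startup action.
func chooseAction(version, versionAlt string) (startupAction, error) {
	// A peer version of "0" means no hot upgrade is in progress.
	if versionAlt == "0" {
		return listenFresh, nil
	}
	v, err := strconv.Atoi(version)
	if err != nil {
		return listenFresh, fmt.Errorf("invalid SIDECARSET_VERSION %q: %w", version, err)
	}
	alt, err := strconv.Atoi(versionAlt)
	if err != nil {
		return listenFresh, fmt.Errorf("invalid SIDECARSET_VERSION_ALT %q: %w", versionAlt, err)
	}
	// Versions increase monotonically, so the larger one is the new sidecar.
	if v > alt {
		return takeOverFromOld, nil
	}
	return handOverToNew, nil
}

func main() {
	for _, pair := range [][2]string{{"1", "0"}, {"2", "1"}, {"1", "2"}} {
		action, err := chooseAction(pair[0], pair[1])
		fmt.Println(pair, action, err)
	}
}
```

In a real container the two arguments would come from `os.Getenv("SIDECARSET_VERSION")` and `os.Getenv("SIDECARSET_VERSION_ALT")`.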

2. ListenFD migration

**ListenFD is migrated between the two containers through a Unix Domain Socket.** This is also the key step of the hot upgrade. The code example is as follows.

For brevity, the code below does not handle any errors.

    /* The old sidecar migrates ListenFD to the new sidecar through a Unix Domain Socket */
    // tcpLn *net.TCPListener
    f, _ := tcpLn.File()
    fdnum := f.Fd()
    data := syscall.UnixRights(int(fdnum))
    // Establish a connection with the new sidecar container through the Unix Domain Socket
    raddr, _ := net.ResolveUnixAddr("unix", "/dev/shm/migrate.sock")
    uds, _ := net.DialUnix("unix", nil, raddr)
    // Send ListenFD to the new sidecar container through the UDS
    uds.WriteMsgUnix(nil, data, nil)
    // Stop receiving new requests and start draining, e.g. HTTP/2 GOAWAY
    tcpLn.Close()

    /* The new sidecar receives ListenFD and starts serving */
    // Listen on the UDS
    addr, _ := net.ResolveUnixAddr("unix", "/dev/shm/migrate.sock")
    unixLn, _ := net.ListenUnix("unix", addr)
    conn, _ := unixLn.AcceptUnix()
    buf := make([]byte, 32)
    oob := make([]byte, 32)
    // Receive ListenFD
    _, oobn, _, _, _ := conn.ReadMsgUnix(buf, oob)
    scms, _ := syscall.ParseSocketControlMessage(oob[:oobn])
    if len(scms) > 0 {
        // Parse the FD and convert it to *net.TCPListener
        fds, _ := syscall.ParseUnixRights(&(scms[0]))
        f := os.NewFile(uintptr(fds[0]), "")
        ln, _ := net.FileListener(f)
        tcpLn, _ := ln.(*net.TCPListener)
        // Start serving external traffic on the received listener
        http.Serve(tcpLn, serveMux)
    }
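The two fragments can be combined into one runnable sketch that plays both roles in a single process on Linux or another Unix system. This is an illustration under stated assumptions, not the code of the Migration Demo: `sendListener`, `receiveListener`, and `handoffDemo` are hypothetical names, the socket lives in a temporary directory instead of /dev/shm, and errors are checked. It shows that the original TCP address keeps serving after the old listener is closed, because the new side holds a duplicate of the same listening socket.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
	"path/filepath"
	"syscall"
)

// sendListener plays the old sidecar: it sends a duplicate of the listening
// descriptor as SCM_RIGHTS ancillary data, next to a 1-byte payload.
func sendListener(tcpLn *net.TCPListener, sockPath string) error {
	f, err := tcpLn.File() // File returns a duplicate descriptor
	if err != nil {
		return err
	}
	defer f.Close()
	uds, err := net.DialUnix("unix", nil, &net.UnixAddr{Name: sockPath, Net: "unix"})
	if err != nil {
		return err
	}
	defer uds.Close()
	_, _, err = uds.WriteMsgUnix([]byte{0}, syscall.UnixRights(int(f.Fd())), nil)
	return err
}

// receiveListener plays the new sidecar: it rebuilds a listener from the
// descriptor received on one UDS connection.
func receiveListener(unixLn *net.UnixListener) (net.Listener, error) {
	conn, err := unixLn.AcceptUnix()
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 4-byte descriptor
	_, oobn, _, _, err := conn.ReadMsgUnix(make([]byte, 1), oob)
	if err != nil {
		return nil, err
	}
	scms, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil || len(scms) == 0 {
		return nil, fmt.Errorf("no control message received: %v", err)
	}
	fds, err := syscall.ParseUnixRights(&scms[0])
	if err != nil || len(fds) == 0 {
		return nil, fmt.Errorf("no descriptor received: %v", err)
	}
	f := os.NewFile(uintptr(fds[0]), "listen-fd")
	defer f.Close() // FileListener works on its own copy of the descriptor
	return net.FileListener(f)
}

// handoffDemo returns the HTTP body served on the original TCP address after
// the old listener has been closed.
func handoffDemo() (string, error) {
	dir, err := os.MkdirTemp("", "hotupgrade")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sockPath := filepath.Join(dir, "migrate.sock")
	ln, err := net.Listen("tcp", "127.0.0.1:0") // owned by the "old sidecar"
	if err != nil {
		return "", err
	}
	// The "new sidecar" listens on the UDS before the old one dials it.
	unixLn, err := net.ListenUnix("unix", &net.UnixAddr{Name: sockPath, Net: "unix"})
	if err != nil {
		return "", err
	}
	defer unixLn.Close()
	var newLn net.Listener
	var recvErr error
	done := make(chan struct{})
	go func() { newLn, recvErr = receiveListener(unixLn); close(done) }()
	if err := sendListener(ln.(*net.TCPListener), sockPath); err != nil {
		return "", err
	}
	ln.Close() // the old sidecar stops accepting
	<-done
	if recvErr != nil {
		return "", recvErr
	}
	defer newLn.Close()
	go http.Serve(newLn, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		io.WriteString(w, "served by new sidecar")
	}))
	resp, err := http.Get("http://" + ln.Addr().String() + "/")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	fmt.Println(handoffDemo())
}
```

Between two real containers, the Unix socket path must be on a volume that both containers can see, which is presumably why the fragments above use a path under /dev/shm.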

Known Mesh container hot upgrade cases

Alibaba Cloud Service Mesh (ASM) provides a fully managed service mesh platform that is compatible with the open source Istio service mesh from the community. **Based on the hot upgrade capability of OpenKruise SidecarSet, ASM has implemented hot upgrade of the data plane Sidecar (Beta). Users can upgrade the data plane version of the service mesh without the application noticing. The official version will be available soon.** In addition to hot upgrade, ASM also supports configuration diagnosis, operation audit, access logs, monitoring, service registry integration, and other capabilities that improve the overall experience of using a service mesh. You are welcome to try it.

To sum up, hot upgrade of Mesh containers in cloud native environments has long been an urgent but tricky problem. The scheme in this article is only one exploration of the problem by Alibaba Group. While giving back to the community, we also hope it encourages others to think about this scenario. We welcome more people to join the OpenKruise community and build richer and more complete Kubernetes application management and delivery extensions together, for scenarios that are larger in scale, more complex, and more demanding in performance.

  • Hot upgrade Migration Demo: Github.com/openkruise/…
  • GitHub: Github.com/openkruise/…
  • Official website: openkruise.io/
  • DingTalk discussion group: