Preface

TKEX-CSIG is an internal cloud container service platform built on Tencent Public Cloud's TKE and EKS container services. It provides a cloud native platform for moving the company's internal services to the cloud in containers. Its hallmarks are cloud native compatibility, adaptation to self-developed businesses, and open source collaboration.

When services move to the cloud in containers, they run into a range of problems. Some require the service itself to be containerized, and others need support from the platform. On the platform side, some problems already have solutions in CVM scenarios, but those solutions do not carry over to Kubernetes because operations work differently there. Pod pre-authorization is one example. We want to solve this problem in a cloud native way and offer it as a platform capability, so that every user can conveniently deploy and manage their business on the platform.

Background

How do you pre-authorize new devices when deploying or scaling out a service? This problem will be familiar to most readers. For security reasons, important internal components and storage systems usually restrict access by request source, for example IP access authorization for CDB, and module authorization for OIDB and VASKEY command words. They either have their own authorization web console where users submit requests, or they provide authorization APIs that O&M platforms can call. In addition, routing systems often need accurate region information for a device's IP at discovery and registration time so that they can route to the nearest region, which requires the IP to be registered in the CMDB in advance.

In the past, when we deployed services on CVM/TVM, this problem was easier to handle. We already had a virtual machine with an allocated IP that was registered in the CMDB. All we had to do was submit the authorization requests for that IP, deploy the business program, and add the instance to routing once everything was ready. This process can be automated with the pipeline capabilities of the O&M platform.

Unlike the step-by-step deployment on a VM that is already in hand, Kubernetes manages the entire Pod life cycle, from creation and IP allocation to business container startup and route maintenance, automatically through the control loops of multiple system controllers. Image-based deployment keeps business instances consistent, destroying and rebuilding Pods is routine, and IPs cannot be fixed.

Businesses often face a variety of pre-authorization requirements, and a single authorization takes anywhere from seconds to several minutes. Most authorization APIs are not designed to carry high QPS, which adds complexity. We need a way to run authorization after the Pod IP is allocated and before the business container starts, to block the subsequent steps until authorization succeeds, and to control the load that Pod rebuilds place on the authorization APIs.

After design and iterative optimization, the TKEX-CSIG platform provides easy-to-use authorization capabilities as a product feature, making such Pod pre-authorization problems easy to handle.

Architecture and capability analysis

Architecture

The authorization system architecture is shown in the figure above. The core idea is to use an Init Container to run complex preprocessing logic before the business container in a Pod starts. The official definition of Init Containers is as follows:

This page provides an overview of init containers: specialized containers that run before app containers in a Pod. Init containers can contain utilities or setup scripts not present in an app image

For small-scale or single-business solutions, we can simply inject an Init Container into the business Workload YAML and have it call the required authorization APIs. To turn this into a platform product capability, we also need to consider the following points:

  • Easy to use and maintainable

    Business efficiency and manageability should be fully considered. The platform records and manages permissions as a resource, which reduces the impact of changes on the business.

  • Rate limiting and self-healing

    Permission APIs are often not designed for high QPS, so calls must be throttled to protect downstream systems.

  • Permission convergence

    For security, destroying and rebuilding Pods may change their IPs, so expired permissions should be actively reclaimed.
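The simple case described above, injecting an Init Container into a Workload, can be sketched in Go as follows. This is only an illustration under stated assumptions: the struct types are simplified stand-ins for the Kubernetes Pod spec, and the container name, image, and the PERMISSION_SET variable are hypothetical rather than the platform's actual configuration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the Kubernetes Pod spec types; a real
// implementation would use k8s.io/api/core/v1 instead.
type EnvVar struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

type Container struct {
	Name  string   `json:"name"`
	Image string   `json:"image"`
	Env   []EnvVar `json:"env,omitempty"`
}

type PodSpec struct {
	InitContainers []Container `json:"initContainers,omitempty"`
	Containers     []Container `json:"containers"`
}

// Hypothetical container name used to recognise an already-injected spec.
const authInitName = "init-action-client"

// injectAuthInit prepends the authorization Init Container unless one with
// the same name is already present, so repeated injection is a no-op.
func injectAuthInit(spec PodSpec, image, permissionSet string) PodSpec {
	for _, c := range spec.InitContainers {
		if c.Name == authInitName {
			return spec
		}
	}
	authC := Container{
		Name:  authInitName,
		Image: image,
		// PERMISSION_SET is an assumed variable name for the configuration index.
		Env: []EnvVar{{Name: "PERMISSION_SET", Value: permissionSet}},
	}
	out := spec
	out.InitContainers = append([]Container{authC}, spec.InitContainers...)
	return out
}

func main() {
	spec := PodSpec{Containers: []Container{{Name: "app", Image: "example/app:1.0"}}}
	spec = injectAuthInit(spec, "example/init-action-client:1.0", "cdb-readonly")
	b, _ := json.MarshalIndent(spec, "", "  ")
	fmt.Println(string(b))
}
```

Making the injection idempotent means the same function can run on every Workload update without stacking duplicate Init Containers.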

Authorization process and product capabilities

Businesses only need to register the required permission resources in the platform's web console, configure permission groups, and associate them with a Workload. The platform automatically injects the Init Container configuration, passes the authorization configuration index and related information through environment variables (ENV), and runs the authorization process when the Pod is created. The components involved in the authorization process are designed as follows:

  • init-action-client

    The Init Container. It acts only as a trigger and does one thing: it makes an HTTP request. It stays unchanged, so the business YAML does not have to be modified when functionality is iterated. The main logic lives further back, in the server.

  • init-action-server

    A Deployment that can be scaled horizontally. It runs the preprocessing logic, such as registering the Pod in the CMDB in advance, initiates the pipeline call that starts the permission application, and polls for the result. It also exposes the process information associated with each Pod so that businesses can check their own status and administrators can locate problems. The backoff retry and circuit breaker logic described later is also implemented here.

  • PermissionCenter

    A platform management component that sits outside the cluster and is responsible for storing and applying for permission resources. It contains a permission resource center that stores the permission details registered by businesses so they can be reused, provides permission set (group) management, and simplifies parameter passing during authorization. It uses a producer/consumer pattern and implements the authorization API calls and result queries on top of Pipeline.
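To make the "trigger only" role of init-action-client concrete, here is a minimal Go sketch of such a client. It is not the platform's implementation: the environment variable names, the JSON fields, and the convention that HTTP 200 means "granted" are all assumptions, and `run` talks to a local stand-in server (via `net/http/httptest`) so the sketch runs anywhere. In a real container the values would come from `os.Getenv` and the URL from the injected configuration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"time"
)

// authRequest is a hypothetical payload; the field names are assumptions.
type authRequest struct {
	Namespace     string `json:"namespace"`
	PodName       string `json:"podName"`
	PodIP         string `json:"podIP"`
	PermissionSet string `json:"permissionSet"`
}

// requestFromEnv collects the values the platform is assumed to inject via
// ENV. getenv is a parameter so the function works without the real environment.
func requestFromEnv(getenv func(string) string) (authRequest, error) {
	req := authRequest{
		Namespace:     getenv("POD_NAMESPACE"),
		PodName:       getenv("POD_NAME"),
		PodIP:         getenv("POD_IP"),
		PermissionSet: getenv("PERMISSION_SET"),
	}
	if req.PodIP == "" || req.PermissionSet == "" {
		return req, errors.New("POD_IP and PERMISSION_SET must be set")
	}
	return req, nil
}

// authorize posts the request and treats any non-200 status as a failure.
func authorize(client *http.Client, url string, req authRequest) error {
	body, err := json.Marshal(req)
	if err != nil {
		return err
	}
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("authorization not granted: %s", resp.Status)
	}
	return nil
}

// run wires the pieces together against a local stand-in for init-action-server.
func run() error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	env := map[string]string{"POD_NAMESPACE": "demo", "POD_NAME": "app-0",
		"POD_IP": "10.0.0.8", "PERMISSION_SET": "cdb-readonly"}
	req, err := requestFromEnv(func(k string) string { return env[k] })
	if err != nil {
		return err
	}
	return authorize(&http.Client{Timeout: 5 * time.Second}, srv.URL, req)
}

func main() {
	if err := run(); err != nil {
		// A non-zero exit fails the Init Container, which blocks the business
		// container and lets the next container creation retry.
		fmt.Fprintln(os.Stderr, "pre-authorization failed:", err)
		os.Exit(1)
	}
	fmt.Println("pre-authorization granted")
}
```

Keeping the client this thin is what allows the server-side logic to evolve without touching business YAML.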

Circuit breakers and backoff retry mechanisms

Many exceptions can occur during authorization, such as misconfigured permission parameters, degraded or unavailable authorization API service, and interface errors or timeouts caused by network problems. Because authorization APIs are often not designed to support high QPS, we combine timeout retries, circuit breakers, and exponential backoff retries for fault tolerance.

  • Timeout retry

    Interface calls and asynchronous tasks have timeout settings and retry mechanisms. To cope with transient failures, the init-action-client container is recreated if it exits abnormally, and each recreation is a new round of retries.

  • Circuit breaker

    A ConfigMap records the number of failed permission requests for Pods in the cluster. After three failures, no further applications are submitted. A reset capability is exposed to the front end so that users and administrators can easily retry.

  • Exponential backoff

    The circuit breaker blocks cases that can never succeed, such as user configuration errors, but it cannot cope with long-lasting transient failures. For example, during a cancellation period, the backend of the authorization API may deny service for anywhere from 10 minutes to several hours. A large number of Pods would then trip the breaker and could no longer be authorized. For each Pod, we therefore added an exponential backoff with jitter and record the latest failure timestamp, which allows one more attempt after a period of time. If the attempt succeeds, the backoff for that Pod is reset. If it fails, the timestamp is updated and the timer restarts.

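// wait.Backoff comes from k8s.io/apimachinery/pkg/util/wait.
bk := &PodBreaker{
    NamespacePod:        namespacePod,
    LastRequestFailTime: time.Now(),
    Backoff: wait.Backoff{
        Duration: 2 * time.Minute, // initial wait
        Factor:   2.0,             // multiplier applied at each step
        Jitter:   1.0,             // adds up to Jitter * Duration at random
        Steps:    5,               // number of steps in which the wait may grow
        Cap:      1 * time.Hour,   // upper bound for the wait
    },
}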
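How the breaker and the backoff might interact can be sketched with the standard library alone. This is one reading of the description above, not the platform's code: `podBreaker` is a simplified stand-in for the `PodBreaker` shown earlier, it keeps its own delay instead of using `wait.Backoff`, and the method names are hypothetical. The threshold of three failures and the 2-minute, factor-2, 1-hour parameters are taken from the text.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Three failed attempts open the breaker, as described in the text.
const maxFailures = 3

// podBreaker tracks failures and the current backoff window for one Pod.
type podBreaker struct {
	failures     int
	lastFailTime time.Time
	nominal      time.Duration // backoff before jitter
	delay        time.Duration // backoff actually enforced (with jitter)
	base         time.Duration
	maxDelay     time.Duration // caps the nominal value only
	factor       float64
	jitter       float64
}

func newPodBreaker() *podBreaker {
	return &podBreaker{base: 2 * time.Minute, maxDelay: time.Hour, factor: 2.0, jitter: 1.0}
}

// allow reports whether an authorization attempt may be made at time now.
func (b *podBreaker) allow(now time.Time) bool {
	if b.failures < maxFailures {
		return true // breaker still closed
	}
	// Breaker open: permit one probe once the backoff window has elapsed.
	return now.Sub(b.lastFailTime) >= b.delay
}

// recordFailure updates the failure timestamp and, once the breaker is
// open, grows the backoff window. rnd must return a value in [0, 1).
func (b *podBreaker) recordFailure(now time.Time, rnd func() float64) {
	b.failures++
	b.lastFailTime = now
	if b.failures < maxFailures {
		return
	}
	if b.nominal == 0 {
		b.nominal = b.base
	} else {
		b.nominal = time.Duration(float64(b.nominal) * b.factor)
	}
	if b.nominal > b.maxDelay {
		b.nominal = b.maxDelay
	}
	b.delay = b.nominal + time.Duration(rnd()*b.jitter*float64(b.nominal))
}

// recordSuccess resets both the breaker and the backoff for this Pod.
func (b *podBreaker) recordSuccess() {
	b.failures, b.nominal, b.delay = 0, 0, 0
}

func main() {
	b := newPodBreaker()
	now := time.Now()
	for i := 0; i < maxFailures; i++ {
		b.recordFailure(now, rand.Float64)
	}
	fmt.Println("allowed right after the breaker opens:", b.allow(now))
	fmt.Println("allowed once the window has elapsed:", b.allow(now.Add(b.delay)))
}
```

The jitter spreads retries from many Pods over time, so a recovering authorization API is not hit by all of them at once.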

Reclaiming permissions with a Finalizer

Permission convergence is often overlooked, but it matters for security. Destroying and rebuilding Pods is routine, and Pod IPs change dynamically rather than staying fixed. Over time this can leave behind a large number of stale permissions, or an authorized IP may be reassigned to another business's Pod, creating a security risk. We built a Finalizer controller that reclaims permissions before a Pod is destroyed. The reclaim action is idempotent and best-effort, because reclaiming also depends on whether the permission provider supports it. We take this into account when connecting new permission types, such as automatic IP authorization for Tencent Cloud MySQL.

To keep the Finalizer's work small and avoid affecting Pods that do not use authorization, the controller acts on Pod change events only for Pods that carry the authorization Init Container, and patches the Finalizer onto them. When such a Pod is scaled down or destroyed, the permissions are reclaimed, the Finalizer is removed, and the GC then deletes the Pod.

kind: Pod
metadata:
  annotations:
~
  creationTimestamp: "2020-11-13T09:16:52Z"
  finalizers:
  - stke.io/podpermission-protection
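The Finalizer flow described above can be sketched as a single reconcile function. This is a simplified model rather than the controller itself: the `pod` struct stands in for the Kubernetes Pod object, `Deleting` stands in for a set deletion timestamp, the Init Container name is hypothetical, and `reclaim` is an assumed callback that revokes the permissions granted to a Pod IP. Only the finalizer name is taken from the manifest above.

```go
package main

import "fmt"

const permissionFinalizer = "stke.io/podpermission-protection" // from the manifest above
const authInitName = "init-action-client"                      // hypothetical name

// pod is a simplified stand-in for the Kubernetes Pod object.
type pod struct {
	Name           string
	IP             string
	InitContainers []string
	Finalizers     []string
	Deleting       bool // stands in for a set metadata.deletionTimestamp
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

func remove(list []string, s string) []string {
	out := make([]string, 0, len(list))
	for _, v := range list {
		if v != s {
			out = append(out, v)
		}
	}
	return out
}

// reconcile handles one Pod change event. reclaim must be idempotent,
// because the same Pod may be reconciled more than once.
func reconcile(p *pod, reclaim func(ip string) error) error {
	if !contains(p.InitContainers, authInitName) {
		return nil // Pods without the authorization Init Container are left alone
	}
	if !p.Deleting {
		if !contains(p.Finalizers, permissionFinalizer) {
			p.Finalizers = append(p.Finalizers, permissionFinalizer)
		}
		return nil
	}
	if contains(p.Finalizers, permissionFinalizer) {
		if err := reclaim(p.IP); err != nil {
			return err // keep the finalizer so a later event can retry
		}
		p.Finalizers = remove(p.Finalizers, permissionFinalizer)
	}
	return nil
}

func main() {
	p := &pod{Name: "app-0", IP: "10.0.0.8", InitContainers: []string{authInitName}}
	reclaim := func(ip string) error {
		fmt.Println("reclaiming permissions for", ip)
		return nil
	}
	_ = reconcile(p, reclaim) // create/update event: the finalizer is added
	fmt.Println("finalizers:", p.Finalizers)
	p.Deleting = true
	_ = reconcile(p, reclaim) // delete event: reclaim, then release the Pod
	fmt.Println("finalizers:", p.Finalizers)
}
```

This sketch retries on every reclaim error. A best-effort variant, closer to the behaviour described in the text, could remove the finalizer after a bounded number of failed attempts so that Pod deletion is never blocked indefinitely.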

Conclusion

This article describes how to handle preprocessing steps, such as automatic authorization, before a business starts on a container platform. An Init Container runs the preprocessing before the business container starts, and the authorization feature is offered as a product so that businesses can easily manage and apply for permission resources. Circuit breakers and backoff retries provide fault tolerance, and a Finalizer reclaims permissions to prevent permission sprawl.

References

  • Init Containers
  • Retry, timeout, and backoff
  • Using Finalizers