Secrets are especially important in Kubernetes because the Secret is the object in K8s where all sensitive information is stored: passwords, cluster certificates, OAuth tokens, SSH keys, and other sensitive files defined by users. A security problem with Secrets in K8s therefore has very serious consequences. And although the community has provided some protection schemes, many problems remain.

What security issues do K8s Secrets face? What are the implications of these issues? What are the shortcomings of the solutions provided by the community? To answer these questions, InfoQ spoke with Ant Group senior engineer Karen Qin, who specializes in trusted computing, system security, and virtualization and has researched K8s Secrets in depth.

K8s Secret security issues

According to the Kubernetes documentation, a Secret is the object in K8s where sensitive information is stored. Storing sensitive information directly in a Pod spec or a container image makes it hard to control and poses serious security risks. By creating, managing, and applying Secret objects, K8s can better control how sensitive information is used and reduce the risk of accidental exposure.

While the introduction of the Secret object has somewhat reduced the risk of accidental disclosure (mostly through centralized management), the Secret object itself is not secure by default: “there are still many security issues in the default community scheme,” Qin said.

By default, K8s stores Secret data in etcd effectively as plaintext: the values are only base64 encoded, not encrypted. Likewise, if a file that defines a Secret is shared or checked into a code repository, the password it contains is easily leaked.
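
To make the base64 point concrete, here is a minimal Python sketch. The `secret_data` dictionary is a hypothetical example of the `data` field of a Secret manifest, and `decode_secret_data` is an illustrative helper, not part of any Kubernetes client. It shows that anyone who can read the stored values recovers the originals without any key.

```python
import base64

# Hypothetical contents of the `data:` field of a Secret manifest.
# The values are base64 encoded, which is an encoding, not encryption.
secret_data = {
    "username": "YWRtaW4=",
    "password": "cGFzc3dvcmQxMjM=",
}


def decode_secret_data(data):
    """Decode every base64 value; no key or credential is needed."""
    return {
        name: base64.b64decode(value).decode("utf-8")
        for name, value in data.items()
    }


decoded = decode_secret_data(secret_data)
for name, value in decoded.items():
    print(f"{name}: {value}")
```

Running it prints the original username and password, which is why base64 alone offers no confidentiality.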

Shortcomings of the community solutions

To address this issue, the K8s community provides a KMS-based Secret encryption scheme, which is supported by Google Cloud, AWS, and Azure. “This fixes the problem of Secrets being stored in plaintext in etcd, but it still has some issues,” Qin said.

  • Secrets and encryption keys are held in memory in plaintext and are easy to compromise.
  • An attacker can impersonate a legitimate user, call the decryption interface and steal the key.
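
The KMS-based scheme is an envelope-encryption design: a data encryption key (DEK) encrypts the Secret, and the KMS wraps the DEK with a key that only the KMS holds. The Python sketch below is a rough illustration of that flow, not the Kubernetes implementation. `ToyKMS`, `seal_secret`, and `open_secret` are hypothetical names, and the SHA-256 keystream is a stand-in for a real cipher such as AES-GCM and is not secure. The comments mark where the two issues above show up.

```python
import hashlib
import os


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || nonce || offset) blocks.

    Stand-in for a real AEAD cipher; NOT secure, for illustration only.
    Applying it twice with the same key and nonce returns the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class ToyKMS:
    """Hypothetical stand-in for a remote KMS holding the key encryption key."""

    def __init__(self):
        self._kek = os.urandom(32)

    def encrypt(self, plaintext: bytes) -> bytes:
        nonce = os.urandom(16)
        return nonce + _keystream_xor(self._kek, nonce, plaintext)

    def decrypt(self, blob: bytes) -> bytes:
        # Issue 2: nothing here verifies who the caller is. Any process that
        # can reach this interface gets the plaintext DEK back.
        nonce, body = blob[:16], blob[16:]
        return _keystream_xor(self._kek, nonce, body)


def seal_secret(kms: ToyKMS, secret: bytes):
    """Encrypt a Secret with a fresh DEK, then wrap the DEK with the KMS."""
    dek = os.urandom(32)  # Issue 1: the DEK and the Secret sit in process memory in plaintext.
    nonce = os.urandom(16)
    ciphertext = nonce + _keystream_xor(dek, nonce, secret)
    wrapped_dek = kms.encrypt(dek)
    return wrapped_dek, ciphertext  # only these two values would be persisted


def open_secret(kms: ToyKMS, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    """Unwrap the DEK through the KMS and decrypt the Secret."""
    dek = kms.decrypt(wrapped_dek)
    nonce, body = ciphertext[:16], ciphertext[16:]
    return _keystream_xor(dek, nonce, body)


kms = ToyKMS()
wrapped, stored = seal_secret(kms, b"db-password")
print(open_secret(kms, wrapped, stored))
```

The stored values are no longer readable on their own, but the plaintext keys still pass through ordinary memory and an unauthenticated decrypt call.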

Once the key is leaked, all data is exposed and users’ trust in the whole system collapses. “To this end, the community and some companies are trying to add hardware-based security protection to the plugin in this scheme to make it harder to attack. But for certain users, the coverage and level of protection are still insufficient.”

In fact, we can look at the entire life cycle of K8s Secret:

  • When a Secret is generated and accessed, the Secret and the identity credentials are held in plaintext in client-side memory. The client-side environment is complex and vulnerable to attackers.
  • In the K8s API server, Secrets are generated and cached in memory in plaintext, and the security root is easy to steal or damage.
  • The encryption and decryption interface of the plugin that interacts with the KMS cannot stop an attacker from impersonating a legitimate caller, so there is a risk of leakage.
  • Secrets are consumed on the Node, where they are still held in memory in plaintext, exposing a certain attack surface.

In Qin’s opinion, the protection of Secrets in K8s should ideally cover the security and reliability of their entire life cycle, so as to achieve end-to-end protection.

Ant Group’s exploration

To this end, Ant Group used TEE (trusted execution environment) technology to protect the key components and steps across the entire life cycle and end-to-end use of K8s Secrets. The overall plan is roughly as follows:

  • On the API Server side, TEE protects the KMS plugin that interacts with the KMS. This keeps performance overhead low while ensuring that the root key (security root) and the data encryption key in the plugin are not at risk of leaking.
  • The KMS provider on the API Server side is protected by TEE so that the data key and the Secret are never directly exposed in memory in plaintext. At the same time, TEE's local attestation mechanism can authenticate the caller of the data key interface, which prevents attackers from impersonating a legitimate caller and keeps the key secure.
  • On the client side, kubectl and kubeconfig are protected by TEE. On the one hand, the kubeconfig is never written to disk and is protected by hardware, which raises the security level. On the other hand, the user's Secret is processed inside the TEE and sent over a secure channel, which avoids direct exposure in memory and the risk of malicious theft. In addition, the user performs remote attestation of the API Server's TEE. This helps users confirm that they are entrusting their Secrets to a trusted software entity (one without malicious logic that deliberately reveals user secrets) and builds trust in the API Server.
  • On the Node side, the consumption of Secrets in the kubelet is protected by TEE, which further avoids direct exposure of Secrets in memory and the risk of malicious theft.

“The solution is a TEE-based, end-to-end K8s Secret protection scheme, and it introduces LibOS technology to make TEE completely transparent to users, developers, and operations teams,” Qin told InfoQ.

Neither the KMS plugin nor a TEE-based KMS plugin has a standard, open-source community implementation, so the team designed and developed its own KMS plugin and hardened it for production in areas such as grayscale (canary) releases, emergency handling, and monitoring and management. In conjunction with TEE, the team provides standalone and service-based KMS plugin solutions to address performance issues with SGX.

Similarly, there is no standard, open-source community implementation of a TEE-based kubectl. Qin said: “We developed our own secure kubectl based on KubeProxy and met design goals such as making kubeconfig transparent to the user, binding it to the user's identity, never writing it to disk, and using TEE to protect it in memory.”

In addition, with the ease of use, reliability, scalability, and maintainability of TEE protection in mind, and after evaluating several options, the team introduced Occlum LibOS, which was open-sourced by Ant Group. It shields users, developers, and operations teams from the impact of TEE and greatly lowers the barrier to and cost of TEE development.

In Qin's view, K8s is the foundation for managing Ant Group's large-scale container clusters. Applying the TEE-based, end-to-end K8s Secret protection scheme strengthens its security and trustworthiness and raises the security level of Ant Group's core control plane, which is “indispensable for the high standards of data security and privacy protection in financial scenarios.”
