Due to a bug in the Linux kernel, you are likely to encounter intermittent 5-second DNS delays in Kubernetes clusters (community issue #56903 [1]). Although the issue has been closed, that does not mean the problem is completely resolved, so you still need to work around this bug when operating and maintaining a Kubernetes cluster.

Why is there intermittent DNS latency

Why do these intermittent 5-second delays occur in a Kubernetes cluster? Weaveworks published a blog post, Racy Conntrack and DNS Lookup Timeouts [2], that explains the cause in detail.

Simply put, because UDP is connectionless, the kernel netfilter module is subject to three race conditions when handling concurrent UDP packets on the same socket. Take the conntrack and DNAT processing of such packets as an example.

Because connect(2) on a UDP socket does not create a conntrack record immediately, but only when the UDP packet is actually sent, concurrent packets can hit three race conditions:

  1. Neither of the two UDP packets finds an existing conntrack record in nf_conntrack_in, so each of them creates a new conntrack record for the same connection (note that the five-tuples are identical).
  2. The conntrack record of one UDP packet is confirmed by the other UDP packet before get_unique_tuple has been called for it.
  3. The two UDP packets select DNAT rules pointing to two different endpoints in ipt_do_table.

All three scenarios cause the final __nf_conntrack_confirm step to fail and one of the UDP packets to be dropped. Since both the GNU C library and musl libc issue the A and AAAA DNS queries in parallel from the same socket, one of the two packets can be dropped because of the kernel races described above. The client then retries only after its timeout expires, which defaults to 5 seconds.
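If you suspect you are hitting this race, the conntrack statistics on the node give a quick signal, because the dropped packet shows up as a failed conntrack insertion. A minimal check, assuming the conntrack-tools package is installed on the node:

# Run on the node, not inside the Pod.
# A steadily growing insert_failed counter indicates conntrack insertion
# clashes, which is exactly what the races described above produce.
conntrack -S | grep -o 'insert_failed=[0-9]*'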

The third issue has not been fixed yet, while the first two have been fixed and are included in kernel 5.0 and 4.19 respectively:

  1. netfilter: nf_nat: skip nat clash resolution for same-origin entries [3] (included in kernel v5.0)
  2. netfilter: nf_conntrack: resolve clash for matching conntracks [4] (included in kernel v4.19)

On public clouds, these patches may also be backported to older kernel releases. For example, on Azure these two fixes are already included in v4.15.0-1030.31 and v4.18.0-1006.6.
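To check whether your nodes already run a patched kernel, the wide node listing includes the kernel version, for example:

# The KERNEL-VERSION column shows each node's kernel; compare it
# against the versions listed above.
kubectl get nodes -o wide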

How can this problem be avoided

To avoid the DNS latency, you need to work around these three problems. Here are a few ways to do that:

  1. Mitigate concurrent DNS queries, for example by setting the single-request-reopen option in the Pod's dnsConfig: when only one of the parallel A and AAAA replies comes back, the resolver closes the socket and resends the other query from a new one instead of waiting for the full timeout (a complete Pod example follows the snippet below):
dnsConfig:
  options:
    - name: single-request-reopen
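For context, here is a minimal Pod manifest sketch showing where this dnsConfig fragment belongs; the name and image are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: dns-options-demo        # placeholder name
spec:
  containers:
    - name: app
      image: busybox:1.36       # placeholder image
      command: ["sleep", "3600"]
  dnsConfig:
    options:
      - name: single-request-reopen

Note that single-request-reopen is a glibc resolver option; musl-based images (such as Alpine) ignore it.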
  2. Disable IPv6 to avoid AAAA queries altogether, for example by adding ipv6.disable=1 to the kernel command line in the GRUB configuration (the node must be rebooted for this to take effect; see the sketch below).
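A sketch of the corresponding GRUB change on the node, assuming a Debian/Ubuntu-style layout (adapt the regeneration command to your distribution):

# /etc/default/grub (excerpt): append ipv6.disable=1 to the existing
# kernel command line parameters (keep your current parameters in place of ...).
GRUB_CMDLINE_LINUX="... ipv6.disable=1"

# Regenerate the GRUB configuration and reboot the node, e.g. on Ubuntu:
update-grub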
  3. Use TCP for DNS, for example by setting the use-vc option in the Pod's dnsConfig to force DNS queries over TCP:
dnsConfig:
  options:
    - name: single-request-reopen
    - name: ndots
      value: "5"
    - name: use-vc
  4. Use NodeLocal DNS Cache [5]: all Pod DNS queries go through a DNS cache running on the local node, which avoids DNAT and therefore the conntrack races in the kernel. You can deploy it with the following command (note that it modifies the kubelet configuration and restarts kubelet):
kubectl apply -f https://github.com/feiskyer/kubernetes-handbook/raw/master/examples/nodelocaldns/nodelocaldns-kubenet.yaml
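After the DaemonSet rolls out, you can verify that the cache pods are running on every node, assuming the manifest uses the upstream k8s-app=node-local-dns label:

kubectl get pods -n kube-system -l k8s-app=node-local-dns -o wide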

References

  • [1] 56903: github.com/kubernetes/…
  • [2] Racy conntrack and DNS lookup timeouts: www.weave.works/blog/racy-c…
  • [3] netfilter: nf_nat: skip nat clash resolution for same-origin entries: git.kernel.org/pub/scm/lin…
  • [4] netfilter: nf_conntrack: resolve clash for matching conntracks: git.kernel.org/pub/scm/lin…
  • [5] Nodelocal DNS Cache: github.com/kubernetes/…

Follow the "Chat Cloud Native" WeChat official account to learn more about cloud native.