Debugging DNS Resolution Issue in Kubernetes
I suggested these steps so we could isolate any problems with your CoreDNS, and as you can see it is working fine.
As you stated, reaching an individual pod that is not part of a StatefulSet is rarely needed in regular Kubernetes use, but it is possible via the pod's DNS record.
When enabled, pods are assigned a DNS A record in the form of pod-ip-address.my-namespace.pod.cluster.local. For example, a pod with IP 1.2.3.4 in the namespace default with a cluster domain of cluster.local would have an entry: 1-2-3-4.default.pod.cluster.local. (Source: Kubernetes documentation, "DNS for Services and Pods")
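The dash-for-dot mapping the quote describes can be sketched in Python (the helper name and default arguments are mine, not part of any Kubernetes API):

```python
def pod_dns_name(ip: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """Build the A record name Kubernetes assigns to a pod IP:
    the dots of the IPv4 address are replaced with dashes."""
    return f"{ip.replace('.', '-')}.{namespace}.pod.{cluster_domain}"

print(pod_dns_name("1.2.3.4"))    # 1-2-3-4.default.pod.cluster.local
print(pod_dns_name("10.28.2.4"))  # 10-28-2-4.default.pod.cluster.local
```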
Example:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dnsutils 1/1 Running 20 20h 10.28.2.3 gke-lab-default-pool-87c6b085-wcp8 <none> <none>
sample-pod 1/1 Running 0 2m11s 10.28.2.4 gke-lab-default-pool-87c6b085-wcp8 <none> <none>
$ kubectl exec -ti dnsutils -- nslookup 10-28-2-4.default.pod.cluster.local
Server: 10.31.240.10
Address: 10.31.240.10#53
Name: 10-28-2-4.default.pod.cluster.local
Address: 10.28.2.4
To your closing question: yes, this is how Kubernetes is meant to work. Your CoreDNS is working as intended, and everything you described is expected behavior.
laimison
Updated on September 18, 2022

Comments
-
laimison over 1 year
I have built a Kubernetes cluster using Kubespray on Ubuntu 18.04 and I am facing a DNS issue: containers cannot communicate through their hostnames.
Things that are working:
- container communication through IP addresses
- internet access from inside the containers
- resolving kubernetes.default
Kubernetes master:
root@k8s-1:~# cat /etc/resolv.conf | grep -v ^\#
nameserver 127.0.0.53
search home
root@k8s-1:~#
Pod:
root@k8s-1:~# kubectl exec dnsutils cat /etc/resolv.conf
nameserver 169.254.25.10
search default.svc.cluster.local svc.cluster.local cluster.local home
options ndots:5
root@k8s-1:~#
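The search and ndots:5 lines above control how bare names get expanded before they reach CoreDNS. A rough Python sketch of the candidate order a glibc-style resolver would try (simplified; it ignores timeouts, IP literals, and trailing-dot edge cases beyond the obvious one):

```python
def candidate_names(name: str, search: list[str], ndots: int = 5) -> list[str]:
    """Order in which a resolver with this resolv.conf tries names.

    A trailing dot means "already fully qualified": query as-is.
    Otherwise, if the name has fewer than `ndots` dots, the search
    domains are appended and tried first, then the bare name.
    """
    if name.endswith("."):
        return [name]
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + expanded
    return expanded + [name]

# The search list from the pod's /etc/resolv.conf above.
search = ["default.svc.cluster.local", "svc.cluster.local",
          "cluster.local", "home"]

# "kubernetes.default" has 1 dot (< 5), so the search list is tried
# first; the second candidate is the one that actually exists.
for candidate in candidate_names("kubernetes.default", search):
    print(candidate)
```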
CoreDNS pods are healthy:
root@k8s-1:~# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME                       READY   STATUS    RESTARTS   AGE
coredns-58687784f9-8rmlw   1/1     Running   0          35m
coredns-58687784f9-hp8hp   1/1     Running   0          35m
root@k8s-1:~#
Logs for CoreDNS pods:
root@k8s-1:~# kubectl describe pods --namespace=kube-system -l k8s-app=kube-dns | tail -n 2
  Normal   Started           35m                 kubelet, k8s-2  Started container coredns
  Warning  DNSConfigForming  12s (x33 over 35m)  kubelet, k8s-2  Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 4.2.2.1 4.2.2.2 208.67.220.220
root@k8s-1:~# kubectl logs --namespace=kube-system coredns-58687784f9-8rmlw
.:53
2020-02-09T22:56:14.390Z [INFO] plugin/reload: Running configuration MD5 = b9d55fc86b311e1d1a0507440727efd2
2020-02-09T22:56:14.391Z [INFO] CoreDNS-1.6.0
2020-02-09T22:56:14.391Z [INFO] linux/amd64, go1.12.7, 0a218d3
CoreDNS-1.6.0
linux/amd64, go1.12.7, 0a218d3
root@k8s-1:~# kubectl logs --namespace=kube-system coredns-58687784f9-hp8hp
.:53
2020-02-09T22:56:20.388Z [INFO] plugin/reload: Running configuration MD5 = b9d55fc86b311e1d1a0507440727efd2
2020-02-09T22:56:20.388Z [INFO] CoreDNS-1.6.0
2020-02-09T22:56:20.388Z [INFO] linux/amd64, go1.12.7, 0a218d3
CoreDNS-1.6.0
linux/amd64, go1.12.7, 0a218d3
root@k8s-1:~#
CoreDNS seems exposed:
root@k8s-1:~# kubectl get svc --namespace=kube-system | grep coredns
coredns   ClusterIP   10.233.0.3   <none>   53/UDP,53/TCP,9153/TCP   37m
root@k8s-1:~# kubectl get ep coredns --namespace=kube-system
NAME      ENDPOINTS                                                  AGE
coredns   10.233.64.2:53,10.233.65.3:53,10.233.64.2:53 + 3 more...   37m
root@k8s-1:~#
These are my problematic pods; the whole cluster is affected by this issue:
root@k8s-1:~# kubectl get pods -o wide -n default
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
busybox                  1/1     Running   0          17m   10.233.66.7   k8s-3   <none>           <none>
dnsutils                 1/1     Running   0          50m   10.233.66.5   k8s-3   <none>           <none>
nginx-86c57db685-p8zhc   1/1     Running   0          43m   10.233.64.3   k8s-1   <none>           <none>
nginx-86c57db685-st7rw   1/1     Running   0          47m   10.233.66.6   k8s-3   <none>           <none>
root@k8s-1:~#
I'm able to reach the internet using DNS and to reach a container through its IP address:
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping 10.233.64.3"
PING 10.233.64.3 (10.233.64.3) 56(84) bytes of data.
64 bytes from 10.233.64.3: icmp_seq=1 ttl=62 time=0.481 ms
64 bytes from 10.233.64.3: icmp_seq=2 ttl=62 time=0.551 ms
...
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping google.com"
PING google.com (172.217.21.174) 56(84) bytes of data.
64 bytes from fra07s64-in-f174.1e100.net (172.217.21.174): icmp_seq=1 ttl=61 time=77.9 ms
...
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping kubernetes.default"
PING kubernetes.default.svc.cluster.local (10.233.0.1) 56(84) bytes of data.
64 bytes from kubernetes.default.svc.cluster.local (10.233.0.1): icmp_seq=1 ttl=64 time=0.030 ms
64 bytes from kubernetes.default.svc.cluster.local (10.233.0.1): icmp_seq=2 ttl=64 time=0.069 ms
...
Actual issue:
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping nginx-86c57db685-p8zhc"
ping: nginx-86c57db685-p8zhc: Name or service not known
command terminated with exit code 2
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping dnsutils"
ping: dnsutils: Name or service not known
command terminated with exit code 2
root@k8s-1:~# kubectl exec -ti busybox -- nslookup nginx-86c57db685-p8zhc
Server:    169.254.25.10
Address:   169.254.25.10:53
** server can't find nginx-86c57db685-p8zhc.default.svc.cluster.local: NXDOMAIN
*** Can't find nginx-86c57db685-p8zhc.svc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.home: No answer
*** Can't find nginx-86c57db685-p8zhc.default.svc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.svc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.home: No answer
command terminated with exit code 1
root@k8s-1:~#
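The NXDOMAIN above is actually consistent with how cluster DNS works: it creates A records for Services, not for bare Deployment pod names, so every search-list expansion of a pod name misses. A toy illustration with a made-up record table (the names and IPs are taken from the transcripts above; real CoreDNS serves many more record types):

```python
# Simplified view of what cluster DNS serves: one A record per Service.
# A Deployment pod name such as "nginx-86c57db685-p8zhc" has no record,
# so every search-domain expansion of it returns NXDOMAIN.
records = {
    "kubernetes.default.svc.cluster.local": "10.233.0.1",
    "nginx.default.svc.cluster.local": "10.233.66.8",
}

def resolve(name, search):
    """Try the name against each search domain; None means NXDOMAIN."""
    for domain in search:
        fqdn = f"{name}.{domain}"
        if fqdn in records:
            return fqdn, records[fqdn]
    return None

search = ["default.svc.cluster.local", "svc.cluster.local",
          "cluster.local", "home"]

print(resolve("nginx", search))                   # service name: resolves
print(resolve("nginx-86c57db685-p8zhc", search))  # pod name: None (NXDOMAIN)
```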
Am I missing something, or how can I fix communication between containers using hostnames?
Many thanks
Updated
More checks:
root@k8s-1:~# kubectl exec -ti dnsutils -- nslookup kubernetes.default
Server:    169.254.25.10
Address:   169.254.25.10#53
Name:    kubernetes.default.svc.cluster.local
Address: 10.233.0.1
I have created StatefulSet:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/application/web/web.yaml
And I'm able to resolve the service "nginx":
root@k8s-1:~/kplay# k exec dnsutils -it nslookup nginx
Server:    169.254.25.10
Address:   169.254.25.10#53
Name:    nginx.default.svc.cluster.local
Address: 10.233.66.8
Name:    nginx.default.svc.cluster.local
Address: 10.233.64.3
Name:    nginx.default.svc.cluster.local
Address: 10.233.65.5
Name:    nginx.default.svc.cluster.local
Address: 10.233.66.6
I'm also able to contact StatefulSet members when using the FQDN:
root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-0.nginx.default.svc.cluster.local
Server:    169.254.25.10
Address:   169.254.25.10#53
Name:    web-0.nginx.default.svc.cluster.local
Address: 10.233.65.5
root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-1.nginx.default.svc.cluster.local
Server:    169.254.25.10
Address:   169.254.25.10#53
Name:    web-1.nginx.default.svc.cluster.local
Address: 10.233.66.8
But not using just hostnames:
root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-0
Server:    169.254.25.10
Address:   169.254.25.10#53
** server can't find web-0: NXDOMAIN
command terminated with exit code 1
root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-1
Server:    169.254.25.10
Address:   169.254.25.10#53
** server can't find web-1: NXDOMAIN
command terminated with exit code 1
root@k8s-1:~/kplay#
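The naming pattern behind these lookups can be sketched as follows (the helper function is hypothetical, but the FQDN shape `<pod>.<service>.<namespace>.svc.<zone>` is the documented one for StatefulSet pods behind a headless Service):

```python
def statefulset_pod_fqdn(pod: str, service: str,
                         namespace: str = "default",
                         zone: str = "cluster.local") -> str:
    """FQDN of a StatefulSet pod behind a headless Service.

    A bare name like "web-0" only gets search-domain expansions such as
    "web-0.default.svc.cluster.local", which has no record (the service
    name is missing from the FQDN), hence the NXDOMAIN above. The full
    <pod>.<service>... form names an existing record.
    """
    return f"{pod}.{service}.{namespace}.svc.{zone}"

print(statefulset_pod_fqdn("web-0", "nginx"))
print(statefulset_pod_fqdn("web-1", "nginx"))
```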
All of them are living in the same namespace:
root@k8s-1:~/kplay# k get pods -n default
NAME                     READY   STATUS    RESTARTS   AGE
busybox                  1/1     Running   22         22h
dnsutils                 1/1     Running   22         22h
nginx-86c57db685-p8zhc   1/1     Running   0          22h
nginx-86c57db685-st7rw   1/1     Running   0          22h
web-0                    1/1     Running   0          11m
web-1                    1/1     Running   0          10m
Another test confirming that I'm able to resolve services:
kubectl create deployment --image nginx some-nginx
kubectl scale deployment --replicas 2 some-nginx
kubectl expose deployment some-nginx --port=12345 --type=NodePort

root@k8s-1:~/kplay# k exec dnsutils -it nslookup some-nginx
Server:    169.254.25.10
Address:   169.254.25.10#53
Name:    some-nginx.default.svc.cluster.local
Address: 10.233.63.137
Final thoughts
Funny fact, but maybe this is just how Kubernetes should work? I'm able to reach a service by hostname, and StatefulSet members if I want to reach an individual pod. Reaching an individual pod that is not part of a StatefulSet seems not very important, at least in my Kubernetes usage (and probably for most people).
-
laimison about 4 years: Thanks @mWatney for your answer. That's an interesting story about how I spent 2 days on an issue that doesn't exist. Welcome to the Kubernetes world :)
-
Mark Watney about 4 years: Glad to be able to confirm your assumptions. Welcome to the Kubernetes world!