Kubernetes calico node CrashLoopBackOff


Solution 1

I got this issue fixed. In my case it was caused by the same IP address being used by both the master and the worker node.

I had created 2 Ubuntu VMs: one for the Kubernetes master and one for the worker node. Each VM was configured with 2 NAT and 2 bridged interfaces. The NAT interfaces were assigned the same IP address in both VMs.

enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.2.15  netmask 255.255.255.0  broadcast 10.0.2.255
        inet6 fe80::a00:27ff:fe15:67e  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:15:06:7e  txqueuelen 1000  (Ethernet)
        RX packets 1506  bytes 495894 (495.8 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1112  bytes 128692 (128.6 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Now, when I used the commands below to create the Calico node, both the master and the worker node ended up using the same interface/IP, i.e. enp0s3:

sudo kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml

sudo kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

How did I know:

Check the log files under the following directories and try to figure out whether the nodes are using the same IP address (a quicker kubectl check follows below the list).

/var/log/containers/
/var/log/pods/<failed_pod_id>/
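
If you have kubectl access, a quicker cross-check is to compare the addresses Kubernetes has registered for the nodes; with the duplicate NAT interfaces, both nodes report the same INTERNAL-IP (10.0.2.15 in my case). A minimal sketch:

# Duplicate addresses show up in the INTERNAL-IP column
kubectl get nodes -o wide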

How to resolve:

Make sure the master and the worker node use different IP addresses. You can either disable NAT in the VMs or assign a static, unique IP address to each node, then reboot the system.
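
If changing the VM networking is not an option, another way (a sketch, assuming the bridged interface is called enp0s8 on every node) is to tell calico-node explicitly which interface to use instead of letting it autodetect enp0s3:

# Pin Calico's IP autodetection to the bridged interface (the interface name is an assumption)
kubectl -n kube-system set env daemonset/calico-node IP_AUTODETECTION_METHOD=interface=enp0s8

# Recreate the calico-node pods so they pick up the new setting
kubectl -n kube-system delete pod -l k8s-app=calico-node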

Solution 2

My problem was related to firewalld:

 Normal   Started                8m (x3 over 8m)   kubelet, worker-node2  Started container
 Normal   Created                8m (x3 over 8m)   kubelet, worker-node2  Created container
 Normal   Pulled                 8m (x2 over 8m)   kubelet, worker-node2  Container image "quay.io/calico/node:v3.0.3" already present on machine
 Warning  Unhealthy              8m (x2 over 8m)   kubelet, worker-node2  Readiness probe failed: Get http://10.0.1.102:9099/readiness: dial tcp 10.0.1.102:9099: getsockopt: connection refused
 Warning  BackOff                4m (x21 over 8m)  kubelet, worker-node2  Back-off restarting failed container

Running firewall-cmd --permanent --add-port=9099/tcp and restarting the pod solved it.
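
For completeness, roughly the full sequence on a firewalld host (a sketch; the firewall-cmd --reload step and the pod selector are additions here, not part of the original fix):

# Allow the calico-node liveness/readiness port (9099) through firewalld
firewall-cmd --permanent --add-port=9099/tcp
firewall-cmd --reload

# Restart the failing calico-node pod on that node so the probes are retried
kubectl -n kube-system delete pod -l k8s-app=calico-node --field-selector spec.nodeName=worker-node2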


Updated on June 09, 2022

Comments

  • Admin, almost 2 years ago

    While there are some questions just like mine out there, the fixes do not work for me. I'm using the Kubernetes v1.9.3 binaries with flannel and Calico to set up a cluster. After applying the Calico YAML files, it gets stuck creating the second pod. What am I doing wrong? The logs aren't really clear about what's wrong.

    kubectl get pods --all-namespaces

    root@kube-master01:/home/john/cookem/kubeadm-ha# kubectl logs calico-node-n87l7 --namespace=kube-system
    Error from server (BadRequest): a container name must be specified for pod calico-node-n87l7, choose one of: [calico-node install-cni]
    root@kube-master01:/home/john/cookem/kubeadm-ha# kubectl logs calico-node-n87l7 --namespace=kube-system install-cni
    Installing any TLS assets from /calico-secrets
    cp: can't stat '/calico-secrets/*': No such file or directory
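
    (For reference, the container can also be selected with the -c flag; a sketch of checking both containers of the pod:)

    kubectl logs calico-node-n87l7 --namespace=kube-system -c calico-node
    kubectl logs calico-node-n87l7 --namespace=kube-system -c install-cni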
    

    kubectl describe pod calico-node-n87l7 returns

    Name:         calico-node-n87l7
    Namespace:    kube-system
    Node:         kube-master01/10.100.102.62
    Start Time:   Thu, 22 Feb 2018 15:21:38 +0100
    Labels:       controller-revision-hash=653023576
                  k8s-app=calico-node
                  pod-template-generation=1
    Annotations:  scheduler.alpha.kubernetes.io/critical-pod=
                  scheduler.alpha.kubernetes.io/tolerations=[{"key": "dedicated", "value": "master", "effect": "NoSchedule" },
     {"key":"CriticalAddonsOnly", "operator":"Exists"}]
    
    Status:         Running
    IP:             10.100.102.62
    Controlled By:  DaemonSet/calico-node
    Containers:
      calico-node:
        Container ID:   docker://6024188a667d98a209078b6a252505fa4db42124800baaf3a61e082ae2476147
        Image:          quay.io/calico/node:v3.0.1
        Image ID:       docker-pullable://quay.io/calico/node@sha256:e32b65742e372e2a4a06df759ee2466f4de1042e01588bea4d4df3f6d26d0581
        Port:           <none>
        State:          Running
          Started:      Thu, 22 Feb 2018 15:21:40 +0100
        Ready:          True
        Restart Count:  0
        Requests:
          cpu:      250m
        Liveness:   http-get http://:9099/liveness delay=10s timeout=1s period=10s #success=1 #failure=6
        Readiness:  http-get http://:9099/readiness delay=0s timeout=1s period=10s #success=1 #failure=3
        Environment:
          ETCD_ENDPOINTS:                     <set to the key 'etcd_endpoints' of config map 'calico-config'>  Optional: false
          CALICO_NETWORKING_BACKEND:          <set to the key 'calico_backend' of config map 'calico-config'>  Optional: false
          CLUSTER_TYPE:                       k8s,bgp
          CALICO_DISABLE_FILE_LOGGING:        true
          CALICO_K8S_NODE_REF:                 (v1:spec.nodeName)
          FELIX_DEFAULTENDPOINTTOHOSTACTION:  ACCEPT
          CALICO_IPV4POOL_CIDR:               10.244.0.0/16
          CALICO_IPV4POOL_IPIP:               Always
          FELIX_IPV6SUPPORT:                  false
          FELIX_LOGSEVERITYSCREEN:            info
          FELIX_IPINIPMTU:                    1440
          ETCD_CA_CERT_FILE:                  <set to the key 'etcd_ca' of config map 'calico-config'>    Optional: false
          ETCD_KEY_FILE:                      <set to the key 'etcd_key' of config map 'calico-config'>   Optional: false
          ETCD_CERT_FILE:                     <set to the key 'etcd_cert' of config map 'calico-config'>  Optional: false
          IP:                                 autodetect
          IP_AUTODETECTION_METHOD:            can-reach=10.100.102.0
          FELIX_HEALTHENABLED:                true
        Mounts:
          /calico-secrets from etcd-certs (rw)
          /lib/modules from lib-modules (ro)
          /var/run/calico from var-run-calico (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-p7d9n (ro)
      install-cni:
        Container ID:  docker://d9fd7a0f3fa9364c9a104c8482e3d86fc877e3f06f47570d28cd1b296303a960
        Image:         quay.io/calico/cni:v2.0.0
        Image ID:      docker-pullable://quay.io/calico/cni@sha256:ddb91b6fb7d8136d75e828e672123fdcfcf941aad61f94a089d10eff8cd95cd0
        Port:          <none>
        Command:
          /install-cni.sh
        State:          Waiting
          Reason:       CrashLoopBackOff
        Last State:     Terminated
          Reason:       Error
          Exit Code:    1
          Started:      Thu, 22 Feb 2018 15:53:16 +0100
          Finished:     Thu, 22 Feb 2018 15:53:16 +0100
        Ready:          False
        Restart Count:  11
        Environment:
          CNI_CONF_NAME:       10-calico.conflist
          ETCD_ENDPOINTS:      <set to the key 'etcd_endpoints' of config map 'calico-config'>      Optional: false
          CNI_NETWORK_CONFIG:  <set to the key 'cni_network_config' of config map 'calico-config'>  Optional: false
        Mounts:
          /calico-secrets from etcd-certs (rw)
          /host/etc/cni/net.d from cni-net-dir (rw)
          /host/opt/cni/bin from cni-bin-dir (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-p7d9n (ro)
    Conditions:
      Type           Status
      Initialized    True
      Ready          False
      PodScheduled   True
    Volumes:
      lib-modules:
        Type:          HostPath (bare host directory volume)
        Path:          /lib/modules
        HostPathType:
      var-run-calico:
        Type:          HostPath (bare host directory volume)
        Path:          /var/run/calico
        HostPathType:
      cni-bin-dir:
        Type:          HostPath (bare host directory volume)
        Path:          /opt/cni/bin
        HostPathType:
      cni-net-dir:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/cni/net.d
        HostPathType:
      etcd-certs:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  calico-etcd-secrets
        Optional:    false
      calico-node-token-p7d9n:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  calico-node-token-p7d9n
        Optional:    false
    QoS Class:       Burstable
    Node-Selectors:  <none>
    Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                     node.kubernetes.io/memory-pressure:NoSchedule
                     node.kubernetes.io/not-ready:NoExecute
                     node.kubernetes.io/unreachable:NoExecute
    Events:
      Type     Reason                 Age                 From                    Message
      ----     ------                 ----                ----                    -------
      Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "cni-net-dir"
      Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "var-run-calico"
      Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "cni-bin-dir"
      Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "lib-modules"
      Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "calico-node-token-p7d9n"
      Normal   SuccessfulMountVolume  34m                 kubelet, kube-master01  MountVolume.SetUp succeeded for volume "etcd-certs"
      Normal   Created                34m                 kubelet, kube-master01  Created container
      Normal   Pulled                 34m                 kubelet, kube-master01  Container image "quay.io/calico/node:v3.0.1" already present on machine
      Normal   Started                34m                 kubelet, kube-master01  Started container
      Normal   Started                34m (x3 over 34m)   kubelet, kube-master01  Started container
      Normal   Pulled                 33m (x4 over 34m)   kubelet, kube-master01  Container image "quay.io/calico/cni:v2.0.0" already present on machine
      Normal   Created                33m (x4 over 34m)   kubelet, kube-master01  Created container
      Warning  BackOff                4m (x139 over 34m)  kubelet, kube-master01  Back-off restarting failed container
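
    Since install-cni complains that /calico-secrets is empty and that path is mounted from the calico-etcd-secrets Secret (the etcd-certs volume above), one thing worth checking is whether that Secret actually contains any data, e.g.:

    # Sketch: list what is stored in the Secret mounted at /calico-secrets
    kubectl -n kube-system get secret calico-etcd-secrets -o yaml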