Pods stuck in ContainerCreating on one of the worker nodes – D2iQ

Problem

One of the worker nodes is unable to create new pods. New pods scheduled to that node stay in the ContainerCreating state, and the pod events show the following warning:

Warning  FailedCreatePodSandBox  24s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "0afdc348f34970289e54db96b6f22a3ffaae7d95e14bc99bc61967d16595a663": rate: Wait(n=1) would exceed context deadline

The kubelet on the affected node logs the following Calico-related error:

Feb 08 08:28:27 worker-1.domain.my kubelet[1334]: E0208 08:28:27.872448    1334 kuberuntime_sandbox.go:69] CreatePodSandbox for pod "prometheus-kubeaddons-set-grafana-home-dashboard-164430480bpz29_kubeaddons(24591d80-3839-4b0c-af27-07faab08202f)" failed: rpc error: code = Unknown desc = failed to setup network for sandbox "e08a1e8cdfaf2b8629157d704af25b742cacd8d769ea8617b7004c1ab1fa9da2": Get "https://[10.1.0.1]:443/apis/crd.projectcalico.org/v1/ipamblocks": context deadline exceeded
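
To confirm that the affected pods all sit on the same node, the pod list can be filtered by node and waiting reason. The following is a minimal sketch and not part of the original procedure: the helper name stuck_pods_on_node and its four-column input layout are assumptions made for illustration, and the node name is the one from the log above. Pods with init containers may report a different waiting reason and would not be matched.

```shell
# Hypothetical helper (not part of kubectl or Calico).
# Reads lines of "<namespace> <pod> <node> <waiting-reason>" on stdin and
# prints "<namespace>/<pod>" for pods on the given node whose containers
# are waiting in ContainerCreating.
stuck_pods_on_node() {
  awk -v node="$1" '$3 == node && $4 ~ /ContainerCreating/ { print $1 "/" $2 }'
}

# Example invocation against a live cluster (assumes kubectl access):
#   kubectl get pods -A --no-headers \
#     -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,NODE:.spec.nodeName,REASON:.status.containerStatuses[*].state.waiting.reason' \
#     | stuck_pods_on_node worker-1.domain.my
```

If the list contains pods from only one node while other nodes keep starting pods normally, the problem is likely local to that node's network setup, as described here.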

Solution

A likely cause is that the Calico IPAM block table has grown too large because of an IP address leak. To check whether that is the case, inspect the ipamblock objects:

$ kubectl get ipamblock -o yaml | wc -l
110916
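
The line count is only a rough signal. A more direct check is to compare the number of allocated addresses with the number of pods, for example using the figures reported by calicoctl ipam show and kubectl get pods -A. A minimal sketch of that comparison follows. The helper name ipam_leak_hint and the default factor of 2 are assumptions chosen for illustration, not thresholds defined by Calico.

```shell
# Hypothetical helper: prints "suspect" when the number of allocated IPs
# exceeds <factor> times the number of pods, otherwise "ok".
# Usage: ipam_leak_hint <allocated_ips> <pod_count> [factor]
ipam_leak_hint() {
  allocated="$1"
  pods="$2"
  factor="${3:-2}"   # assumed default; tune for your cluster
  if [ "$allocated" -gt $((pods * factor)) ]; then
    echo "suspect"
  else
    echo "ok"
  fi
}

# Example: 5000 allocated addresses for 300 pods
#   ipam_leak_hint 5000 300
```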

If the IPAM block output is suspiciously large and the number of allocated IP addresses is significantly higher than the number of pods, you can use the calicoctl CLI tool to fix the problem:

1. Run the following command on your workstation to get the report for problematic IP addresses:

calicoctl ipam check --show-problem-ips -o ip.json

2. Run the following command on your workstation to release the problematic IP addresses from Calico:

calicoctl ipam release --from-report=./ip.json
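
The two steps above can be chained so that the release only runs when the check succeeded and actually wrote a report. This is a minimal sketch under stated assumptions: the wrapper name release_leaked_ips is hypothetical, the default report path is the one used above, and calicoctl is assumed to be on the PATH and configured for the cluster. Check calicoctl ipam release --help for your version first, since the available flags differ between releases.

```shell
# Hypothetical wrapper around the two calicoctl commands shown above.
# Usage: release_leaked_ips [report-path]   (defaults to ./ip.json)
release_leaked_ips() {
  report="${1:-./ip.json}"

  # Step 1: write the report of problematic IP addresses.
  calicoctl ipam check --show-problem-ips -o "$report" || return 1

  # Stop if the check produced no report or an empty one.
  if [ ! -s "$report" ]; then
    echo "no report written to $report, nothing released" >&2
    return 1
  fi

  # Step 2: release the addresses listed in the report.
  calicoctl ipam release --from-report="$report"
}

# Example:
#   release_leaked_ips ./ip.json
```

Reviewing the report file between the two steps is still advisable, because the release is not easily undone.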

This should release the leaked IP addresses and allow new pods to start. You can find more information about the calicoctl tool in the Calico documentation.