Problem
One of the worker nodes is unable to create new pods. New pods remain stuck in the ContainerCreating
state, and their events show the following warning:
Warning FailedCreatePodSandBox 24s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "0afdc348f34970289e54db96b6f22a3ffaae7d95e14bc99bc61967d16595a663": rate: Wait(n=1) would exceed context deadline
The kubelet log on the affected node shows the following Calico-related error:
Feb 08 08:28:27 worker-1.domain.my kubelet[1334]: E0208 08:28:27.872448 1334 kuberuntime_sandbox.go:69] CreatePodSandbox for pod "prometheus-kubeaddons-set-grafana-home-dashboard-164430480bpz29_kubeaddons(24591d80-3839-4b0c-af27-07faab08202f)" failed: rpc error: code = Unknown desc = failed to setup network for sandbox "e08a1e8cdfaf2b8629157d704af25b742cacd8d769ea8617b7004c1ab1fa9da2": Get "https://[10.1.0.1]:443/apis/crd.projectcalico.org/v1/ipamblocks": context deadline exceeded
Solution
A possible cause is that the Calico IPAM block data has grown too large because of an IP address leak, which would be consistent with the request for ipamblocks timing out in the kubelet log. To check whether that is the case, inspect the ipamblock
objects:
$ kubectl get ipamblock -o yaml | wc -l
110916
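The line count is only a rough signal. A more direct check is to compare the number of allocated addresses with the number of pod IPs. The following is a minimal sketch, not part of the original procedure. It assumes jq is installed and that each IPAM block stores its allocations in .spec.allocations with null marking a free slot. The function names and the default factor of 2 are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for comparing Calico allocations with pod IPs.

# Count allocated addresses across all IPAM blocks.
# Assumption: .spec.allocations is an array where null means "free".
count_allocated_ips() {
  kubectl get ipamblocks -o json |
    jq '[.items[].spec.allocations[]? | select(. != null)] | length'
}

# Count distinct pod IPs, skipping host-network pods,
# which do not take an address from Calico IPAM.
count_pod_ips() {
  kubectl get pods --all-namespaces -o json |
    jq '[.items[] | select(.spec.hostNetwork != true) | .status.podIP // empty] | unique | length'
}

# leak_suspected ALLOCATED PODS [FACTOR]
# Succeeds when ALLOCATED is greater than PODS * FACTOR (FACTOR defaults to 2).
leak_suspected() {
  allocated=$1
  pods=$2
  factor=${3:-2}
  [ "$allocated" -gt $(( pods * factor )) ]
}
```

After loading these functions into a shell, running leak_suspected "$(count_allocated_ips)" "$(count_pod_ips)" && echo "possible IP leak" prints the message only when allocations exceed twice the pod IP count. Calico may also allocate a few addresses per node for tunnel interfaces, so a small surplus is expected.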
If the output is suspiciously large and the number of allocated IP addresses is significantly higher than the number of pods, you can use the calicoctl CLI tool to fix the problem:
1. Run the following command on your workstation to get the report for problematic IP addresses:
calicoctl ipam check --show-problem-ips -o ip.json
2. Run the following command on your workstation to release the problematic IP addresses from Calico:
calicoctl ipam release --from-report=./ip.json
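The two steps above can be wrapped in a small script. This is a sketch under stated assumptions rather than a prescribed procedure: the function name release_leaked_ips and the backup file name are hypothetical, and the backup of the ipamblock objects is an extra precaution that the original steps do not include.

```shell
#!/bin/sh
# Hypothetical wrapper around the two calicoctl steps.
# Usage: release_leaked_ips [REPORT_FILE] [BACKUP_FILE]
release_leaked_ips() {
  report=${1:-ip.json}
  backup=${2:-ipamblocks-backup.yaml}

  # Precaution (not in the original steps): keep a copy of the current
  # IPAM blocks so the state before the release can be inspected later.
  kubectl get ipamblock -o yaml > "$backup" || return 1

  # Step 1: write the report of problematic IP addresses.
  calicoctl ipam check --show-problem-ips -o "$report" || return 1

  # Stop here if no report was produced.
  if [ ! -s "$report" ]; then
    echo "no report written to $report" >&2
    return 1
  fi

  # Step 2: release the addresses listed in the report.
  calicoctl ipam release --from-report="$report"
}
```

Calling release_leaked_ips with no arguments uses ip.json for the report, matching the file name in the steps above. Reviewing the report before the release step is a reasonable extra check on a production cluster.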
Releasing the leaked IP addresses should allow new pods to start on the node. You can find more information about the calicoctl tool here: