Problem
Sometimes you must restart the core Kubernetes components in a DKP cluster: etcd, kube-apiserver, kube-controller-manager, or kube-scheduler. The problem is that these pods are static, so you cannot restart them with the kubectl delete <pod name> command. The AGE column of the kubectl get pods output can also be misleading for static pods: even if it shows that a static pod restarted recently, the corresponding pod containers were not restarted.
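Because the AGE column reflects the pod object rather than the running containers, a more reliable check is the container start time in the pod status. A minimal sketch, where kube-apiserver-<node-name> stands for the pod name on your control plane node:

kubectl get pod kube-apiserver-<node-name> -n kube-system -o jsonpath='{.status.containerStatuses[0].state.running.startedAt}'

If this timestamp does not change, the containers were not actually restarted.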
Solution
To restart a container of one of the core components, you need to move its manifest out of the /etc/kubernetes/manifests directory on the control plane node host. Below are the steps for restarting the kube-apiserver component:
1) SSH to the control plane node, or follow this guide if you don't have SSH access (in this case, you need to prefix the filesystem paths with /host; see the sketch after this list).
2) Move the kube-apiserver manifest out of the manifests directory: mv /etc/kubernetes/manifests/kube-apiserver.yaml /root/
3) Wait until the corresponding kube-apiserver pod is gone:
zd10282$ kubectl get pods -n kube-system | grep api
kube-apiserver-ip-10-0-203-99.us-west-2.compute.internal   1/1   Running   0             36m
kube-apiserver-ip-10-0-69-238.us-west-2.compute.internal   1/1   Running   1 (39m ago)   38m
4) Move the kube-apiserver manifest back: mv /root/kube-apiserver.yaml /etc/kubernetes/manifests/
5) Wait until the corresponding kube-apiserver pod is back:
zd10282$ kubectl get pods -n kube-system | grep api
kube-apiserver-ip-10-0-166-232.us-west-2.compute.internal   1/1   Running   0             15s
kube-apiserver-ip-10-0-203-99.us-west-2.compute.internal    1/1   Running   0             39m
kube-apiserver-ip-10-0-69-238.us-west-2.compute.internal    1/1   Running   1 (41m ago)   41m
6) Remember to restart the rest of the core component pods and to repeat the procedure on the remaining control plane nodes if needed. To avoid the risk of causing a service outage or losing control of your cluster, you must restart the pods one by one.
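If you reach the node without SSH, as mentioned in step 1, the move commands from steps 2 and 4 would look roughly like this. This is a sketch that assumes the node's root filesystem is mounted at /host in your session; adjust the mount point to your setup:

mv /host/etc/kubernetes/manifests/kube-apiserver.yaml /host/root/
mv /host/root/kube-apiserver.yaml /host/etc/kubernetes/manifests/

Run the first command, wait until the pod is gone, then run the second one and wait for the pod to come back, exactly as in steps 3 and 5.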