Description of Issue
You may have a cluster that uses anonymous login for the default Docker registry and has reached its download (pull) limit. In this case, you need to change the login to an authenticated user, which has a higher pull limit.
This solution can also be used if you wish to change the container registry or registry credentials for other reasons.
Solution
Follow these steps to update the Docker credentials on each node. This applies to AWS clusters and requires the AWS Session Manager plugin:
https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-install-plugin.html
1. Create a secret containing the registry details (this example uses the default Docker registry):
apiVersion: v1
kind: Secret
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: <CLUSTERNAME>
    clusterctl.cluster.x-k8s.io/move: ""
  name: <CLUSTERNAME>-control-plane-containerd-configuration
  namespace: default
stringData:
  mirror: |-
    # override all the mirrors configuration
    # Containerd automatically appends mirrors."docker.io"
    # need to explicitly override mirrors."docker.io" with the mirror to pull images from dockerhub
    [plugins."io.containerd.grpc.v1.cri".registry.configs."registry-1.docker.io".auth]
      username = "<username>"
      password = "<password>"
type: Opaque
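If the same Secret is needed for several clusters, the manifest above can be generated instead of edited by hand. The following is a minimal sketch, not part of the original procedure: the function name render_containerd_secret and its argument order are hypothetical, and it assumes the username and password contain no double quotes or backslashes, which would need escaping inside the TOML strings.

```shell
# Hypothetical helper: print the Secret manifest for one cluster.
# Arguments: cluster name, registry username, registry password.
render_containerd_secret() {
  cluster_name=$1
  registry_user=$2
  registry_password=$3
  # Unquoted heredoc delimiter so the three variables are expanded.
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${cluster_name}
    clusterctl.cluster.x-k8s.io/move: ""
  name: ${cluster_name}-control-plane-containerd-configuration
  namespace: default
stringData:
  mirror: |-
    [plugins."io.containerd.grpc.v1.cri".registry.configs."registry-1.docker.io".auth]
      username = "${registry_user}"
      password = "${registry_password}"
type: Opaque
EOF
}

# Example (not run here): pipe the manifest straight into kubectl.
# render_containerd_secret <CLUSTERNAME> <username> <password> | kubectl apply -f -
```

Piping the output to kubectl apply -f - avoids leaving a file with the password on disk.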
2. Edit the KubeadmControlPlane object of the AWS cluster:
kubectl edit KubeadmControlPlane <CLUSTERNAME>-control-plane
Add the following entry to the list at spec.kubeadmConfigSpec.files:
- contentFrom:
    secret:
      key: mirror
      name: <CLUSTERNAME>-control-plane-containerd-configuration
  path: /etc/containerd/konvoy-conf.d/konvoy-mirror.toml
  permissions: "0600"
3. Get a list of machines
kubectl get machines
NAME                                        CLUSTER               NODENAME                                     PROVIDERID                              PHASE          AGE     VERSION
mkamsika-1685953878-control-plane-7dbkz     mkamsika-1685953878                                                                                        Provisioning   8s      v1.25.4
mkamsika-1685953878-control-plane-kpzcc     mkamsika-1685953878   ip-10-0-138-96.us-west-2.compute.internal    aws:///us-west-2b/i-0afb1e21f69b792bf   Running        3h14m   v1.25.4
mkamsika-1685953878-control-plane-pnm7r     mkamsika-1685953878   ip-10-0-74-232.us-west-2.compute.internal    aws:///us-west-2a/i-072fee875216fbeba   Running        93m     v1.25.4
mkamsika-1685953878-control-plane-q48gt     mkamsika-1685953878   ip-10-0-212-24.us-west-2.compute.internal    aws:///us-west-2c/i-0ab3125d5574b143a   Running        101m    v1.25.4
mkamsika-1685953878-md-0-774bb6cc84-5whkf   mkamsika-1685953878   ip-10-0-103-80.us-west-2.compute.internal    aws:///us-west-2a/i-0372b0ae587cf4ace   Running        5h3m    v1.24.6
mkamsika-1685953878-md-0-774bb6cc84-c8fnh   mkamsika-1685953878   ip-10-0-111-220.us-west-2.compute.internal   aws:///us-west-2a/i-0f8a9e63d900547b4   Running        5h3m    v1.24.6
mkamsika-1685953878-md-0-774bb6cc84-ddg8n   mkamsika-1685953878   ip-10-0-91-101.us-west-2.compute.internal    aws:///us-west-2a/i-0c0e559e0c1ab2c56   Running        5h3m    v1.24.6
mkamsika-1685953878-md-0-774bb6cc84-wqjxx   mkamsika-1685953878   ip-10-0-106-159.us-west-2.compute.internal   aws:///us-west-2a/i-0c6b9b4b8d2aec23e   Running        5h3m    v1.24.6
mkamsika-1685953878-md-0-7fb5c55bd6-rptj6   mkamsika-1685953878   ip-10-0-92-237.us-west-2.compute.internal    aws:///us-west-2a/i-0c219b9adfe1b59c2   Running        79m     v1.25.4
You will see that machines will be deleted and then recreated.
4. If any machines cannot be deleted because of pod disruption budgets, look for nodes running pods that cannot pull images with the old credentials:
kubectl get pods -A -owide | grep ImagePullBackOff
From the output of this command you can identify the existing machines that need their registry credentials updated.
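To turn that output into a list of nodes, the NODE column can be extracted and de-duplicated. This is a minimal sketch under stated assumptions: the function name nodes_with_pull_errors is hypothetical, and it assumes the default wide column layout in which STATUS is the fourth column and NODE is the third column from the end (before NOMINATED NODE and READINESS GATES).

```shell
# Hypothetical helper: read `kubectl get pods -A -o wide` output on stdin and
# print each node that hosts a pod stuck on an image pull, once per node.
nodes_with_pull_errors() {
  # $4 is STATUS when -A adds the NAMESPACE column; counting NODE from the
  # end of the line avoids the spaces in RESTARTS values like "2 (3m ago)".
  awk '$4 ~ /ImagePullBackOff|ErrImagePull/ { print $(NF-2) }' | sort -u
}

# Example (not run here):
# kubectl get pods -A -owide | nodes_with_pull_errors
```

Each node name printed can be matched against the NODENAME column of kubectl get machines to find the corresponding PROVIDERID.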
5. Connect to the first affected machine using the instance ID, which is the last segment of its PROVIDERID:
# Set the region
export AWS_REGION=us-west-2
# Connect to the host
aws ssm start-session --target i-07c6b422e6360f029
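The instance ID and region can both be read from the PROVIDERID value rather than copied by hand. The sketch below is illustrative only: the two function names are hypothetical, and the region helper assumes a standard availability zone name made of the region plus one trailing letter (for example us-west-2b), which may not hold for other zone types.

```shell
# Hypothetical helpers: split an AWS ProviderID of the form
# aws:///<availability-zone>/<instance-id> into the pieces the commands need.
instance_id_from_provider_id() {
  # Keep everything after the last "/".
  printf '%s\n' "${1##*/}"
}

region_from_provider_id() {
  zone=${1#aws:///}   # drop the scheme prefix
  zone=${zone%%/*}    # drop "/<instance-id>"
  # Assumption: the zone is the region name plus one trailing lowercase letter.
  printf '%s\n' "${zone%[a-z]}"
}

# Example (not run here):
# pid='aws:///us-west-2b/i-0afb1e21f69b792bf'
# export AWS_REGION=$(region_from_provider_id "$pid")
# aws ssm start-session --target "$(instance_id_from_provider_id "$pid")"
```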
6. Edit the containerd config file:
sudo vi /etc/containerd/config.toml
Add the following section under plugins, replacing <username> and <password> with the real credentials:
[plugins."io.containerd.grpc.v1.cri".registry]
  [plugins."io.containerd.grpc.v1.cri".registry.configs]
    [plugins."io.containerd.grpc.v1.cri".registry.configs."registry-1.docker.io".auth]
      username = "<username>"
      password = "<password>"
      auth = ""
      identitytoken = ""
7. Restart containerd:
sudo systemctl restart containerd
8. Repeat steps 5 to 7 for each affected instance ID.
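When working through several instances, a quick check on each node can confirm that the auth section was actually saved before moving on. This is a minimal sketch: the function name has_docker_registry_auth is hypothetical, and it only checks that the section header is present in the file, not that the credentials are valid.

```shell
# Hypothetical helper: succeed only if the given containerd config contains
# an auth table for registry-1.docker.io. It does not validate the credentials.
has_docker_registry_auth() {
  # -F treats the header as a fixed string, so "[" and "." are matched literally.
  grep -qF '[plugins."io.containerd.grpc.v1.cri".registry.configs."registry-1.docker.io".auth]' "$1"
}

# Example (not run here), on the node:
# has_docker_registry_auth /etc/containerd/config.toml && echo "auth section present"
```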