Description of the Issue:
A Konvoy deployment fails when the konvoy binary is NOT executed from the directory where images.json is located. In that case the containerd image registry endpoints in config.toml are misconfigured, and containerd cannot pull images from the registry.
To confirm whether you have hit this issue, check whether containerd fails to pull images even though they are present in the registry and can be pulled manually. The following is an example of an event logged when the issue occurs:
Nov 06 12:38:36 konvoy-airgapped-master-1 containerd[26429]: time="2020-11-06T12:38:36.345616943Z" level=error msg="PullImage \"mesosphere/keepalived-snmp:v0.2\" failed" error="failed to pull and unpack image \"docker.io/mesosphere/keepalived-snmp:v0.2\": failed to resolve reference \"docker.io/mesosphere/keepalived-snmp:v0.2\": failed to do request: Head https://registry-1.docker.io/v2/mesosphere/keepalived-snmp/manifests/v0.2: dial tcp: lookup registry-1.docker.io on 192.168.122.1:53: no such host"
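When many such events are logged, it can help to extract the failing image and the host that could not be resolved. The following is a minimal Python sketch, not part of Konvoy; the function name parse_pull_failure and the regular expressions are illustrative assumptions based only on the event shown above, so other containerd versions may word the message differently.

```python
import re

# Image name inside a containerd "PullImage ... failed" message.
# Inner quotes are backslash-escaped in the journal line, as in the event above.
PULL_FAILED = re.compile(r'PullImage \\"(?P<image>[^"\\]+)\\" failed')
# Host whose DNS lookup failed.
LOOKUP_FAILED = re.compile(r'lookup (?P<host>\S+) on \S+: no such host')


def parse_pull_failure(line):
    """Return (image, unresolved_host) for a failed pull, or None if the line is unrelated."""
    pull = PULL_FAILED.search(line)
    if pull is None:
        return None
    lookup = LOOKUP_FAILED.search(line)
    host = lookup.group("host") if lookup else None
    return pull.group("image"), host


# The event from this article, split across raw strings for readability.
sample = (
    r'Nov 06 12:38:36 konvoy-airgapped-master-1 containerd[26429]: '
    r'time="2020-11-06T12:38:36.345616943Z" level=error '
    r'msg="PullImage \"mesosphere/keepalived-snmp:v0.2\" failed" '
    r'error="failed to pull and unpack image \"docker.io/mesosphere/keepalived-snmp:v0.2\": '
    r'failed to resolve reference \"docker.io/mesosphere/keepalived-snmp:v0.2\": '
    r'failed to do request: Head https://registry-1.docker.io/v2/mesosphere/keepalived-snmp/manifests/v0.2: '
    r'dial tcp: lookup registry-1.docker.io on 192.168.122.1:53: no such host"'
)

result = parse_pull_failure(sample)
```

For the sample event, result holds the image mesosphere/keepalived-snmp:v0.2 and the host registry-1.docker.io. An unresolved public registry host on an air-gapped node suggests that containerd is not being directed to the private registry.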
When provisioning a Konvoy cluster in an air-gapped environment (cloud or on-prem), the “imageRegistries” parameter in cluster.yaml defines which image registry containerd pulls images from.
For example, when “imageRegistries” is defined as:
imageRegistries:
  - server: https://my-registry.network:5000
    username: "your-user"
    password: "the-passwd"
    default: true
Konvoy configures the containerd image registry endpoints in /etc/containerd/config.toml based on the information in images.json. Image registry mirrors are configured only when “default: true” is specified for imageRegistries in cluster.yaml. The following is an example of the image registry configuration in config.toml:
[plugins."io.containerd.grpc.v1.cri".registry]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.elastic.co"]
      endpoint = ["https://my-registry.network:5000/v2/"]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      endpoint = ["https://my-registry.network:5000/v2/","https://registry-1.docker.io"]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."gcr.io"]
      endpoint = ["https://my-registry.network:5000/v2/"]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."quay.io"]
      endpoint = ["https://my-registry.network:5000/v2/"]
  [plugins."io.containerd.grpc.v1.cri".registry.configs]
    [plugins."io.containerd.grpc.v1.cri".registry.configs."my-registry.network:5000".auth]
      username = "your-user"
      password = "the-passwd"
      auth = ""
      identitytoken = ""
Solution:
Place and execute the konvoy binary in the directory where images.json is located. Otherwise, the containerd image registry endpoints will not be specified correctly and the deployment will fail.
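A small pre-flight check can guard against running konvoy from the wrong directory. The following is a minimal sketch under stated assumptions: the helper names preflight and run_konvoy are hypothetical, the binary is assumed to be a file named konvoy in the same directory as images.json, and the arguments passed to it are left to the caller.

```python
import subprocess
from pathlib import Path


def preflight(workdir):
    """Return a list of problems that would reproduce the issue; empty if none."""
    workdir = Path(workdir)
    problems = []
    if not (workdir / "images.json").is_file():
        problems.append(f"images.json not found in {workdir}")
    if not (workdir / "konvoy").is_file():
        problems.append(f"konvoy binary not found in {workdir}")
    return problems


def run_konvoy(workdir, args):
    """Run the konvoy binary located in workdir, with workdir as the working directory."""
    problems = preflight(workdir)
    if problems:
        raise FileNotFoundError("; ".join(problems))
    workdir = Path(workdir).resolve()
    # cwd=workdir ensures konvoy runs from the directory that holds images.json.
    return subprocess.run([str(workdir / "konvoy"), *args], cwd=workdir, check=True)
```

For example, run_konvoy("/path/to/konvoy-dir", ["up"]) would refuse to start unless both files are present in that directory; the path here is a placeholder.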