Affected Component(s): Konvoy Air Gapped Deployments
Overview/Background
When deploying Konvoy in an air gapped environment, you will need a local Docker registry to host any images you would like to use with your cluster, as well as the images needed to deploy the cluster itself. If specific addons are not deploying, or an image you previously uploaded to your Docker registry is not being pulled, check containerd's logs for helpful messages, such as:
containerd: level=error msg="PullImage "quay.io/prometheus/snmp-exporter:v0.15.0" failed" error="failed to resolve image "quay.io/prometheus/snmp-exporter:v0.15.0": no available registry endpoint: failed to do request: Head https://quay.io/v2/prometheus/snmp-exporter/manifests/v0.15.0
We can tell that something is wrong in the above example because, in our air gapped environment, the URL https://quay.io/v2/prometheus/snmp-exporter/manifests/v0.15.0 is not reachable, and containerd should be pulling from the local registry rather than from the upstream public registry.
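When many addons fail at once, it can help to pull the affected image references out of the containerd log in one pass. The following is a minimal sketch, not part of Konvoy: the function names are hypothetical, the sample line is an abbreviated copy of the log message above, and the pattern assumes the `PullImage "<image>" failed` wording shown there.

```python
import re

# Matches the image reference inside a 'PullImage "<ref>" failed' message.
# The quotes may appear escaped (\") depending on how the log was captured.
PULL_FAILED = re.compile(r'PullImage \\?"([^"\\]+)\\?" failed')


def failed_pulls(log_text):
    """Return the sorted, de-duplicated image references whose pull failed."""
    return sorted(set(PULL_FAILED.findall(log_text)))


def registry_host(image_ref):
    """Return the registry host portion of a fully qualified image reference."""
    return image_ref.split("/", 1)[0]


# Abbreviated copy of the containerd message shown above.
sample = (
    'containerd: level=error msg="PullImage '
    '"quay.io/prometheus/snmp-exporter:v0.15.0" failed" '
    'error="failed to resolve image '
    '"quay.io/prometheus/snmp-exporter:v0.15.0": no available registry endpoint"'
)

for ref in failed_pulls(sample):
    # A public host such as quay.io here means containerd tried to go upstream.
    print(registry_host(ref), ref)
```

Any host printed in the first column that is not your local registry points to an image that containerd attempted to fetch from outside the air gapped environment.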
Solution
To find the step at which the failure occurred, we can run the following checks. To start, we should reference images.json, a file included in the Konvoy Air Gapped Deployment bundle. This file describes every image that Konvoy needs in order to deploy successfully. The snmp-exporter entry shows the scheme, registry, image name, and tag that Konvoy expects:
{ "scheme": "https", "registry": "quay.io", "image": "prometheus/snmp-exporter", "tag": "v0.15.0" },
If the image is not listed in images.json, then it may not be part of a standard Konvoy deployment, or your images.json file may be out of date. To see whether all the images in images.json were successfully pushed to our Docker registry, we can list all repositories being served:
curl -X GET https://local-registry.testcluster.com:5000/v2/_catalog --user testuser:testpassword > registryImages.json
Because the output is redirected, the unformatted JSON response is written to registryImages.json, which we can then inspect:
{ "repositories": [ "prometheus/alertmanager", "prometheus/node-exporter", "prometheus/prometheus", "prometheus/snmp-exporter" ] }
We should see "prometheus/snmp-exporter" listed as above. If it is not, we can re-upload the images to rule out a problem with the original push to the registry:
konvoy config images seed
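Rather than checking one repository by eye, the catalog check can be applied to every entry in images.json. The sketch below assumes that images.json is a JSON array of objects shaped like the snmp-exporter entry above, and that the local registry serves each image under its "image" name, as the catalog output suggests. The function name is hypothetical and the inline data stands in for the real files.

```python
import json


def missing_repositories(images, catalog):
    """Return image names from images.json that the registry catalog does not list."""
    served = set(catalog.get("repositories", []))
    wanted = {entry["image"] for entry in images}
    return sorted(wanted - served)


# Inline sample data mirroring the snippets above. In practice, load the files:
#   with open("images.json") as f: images = json.load(f)
#   with open("registryImages.json") as f: catalog = json.load(f)
images = json.loads(
    '[{"scheme": "https", "registry": "quay.io", '
    '"image": "prometheus/snmp-exporter", "tag": "v0.15.0"}]'
)
catalog = json.loads(
    '{"repositories": ["prometheus/alertmanager", "prometheus/node-exporter"]}'
)

print(missing_repositories(images, catalog))  # ['prometheus/snmp-exporter']
```

An empty list means every image named in images.json has a repository in the registry. Note that the catalog endpoint may paginate its results on large registries, so a long list of missing repositories may simply mean that only the first page was saved.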
The above curl command tells us which repositories are available in the registry catalog, but it doesn't tell us which versions. We should verify that the registry is serving the properly tagged versions of our images, and we can do this by querying for the tags of a specific repository:
curl -X GET https://local-registry.testcluster.com:5000/v2/prometheus/snmp-exporter/tags/list --user testuser:testpassword
This should return the specific tag information defined in images.json:
{"name":"prometheus/snmp-exporter","tags":["v0.15.0"]}
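The comparison between the tags/list response and the images.json entry can be expressed as a small check. This is a sketch under the same assumption about the shape of an images.json entry. The function name is hypothetical, and the handling of a null "tags" value is a defensive assumption for repositories that have no tags.

```python
import json


def tag_is_served(expected, tags_response):
    """Check that a tags/list response covers an images.json entry (name and tag)."""
    if tags_response.get("name") != expected["image"]:
        return False
    # Treat a missing or null "tags" value as an empty tag list.
    return expected["tag"] in (tags_response.get("tags") or [])


# The images.json entry and the tags/list response shown above.
expected = json.loads(
    '{"scheme": "https", "registry": "quay.io", '
    '"image": "prometheus/snmp-exporter", "tag": "v0.15.0"}'
)
response = json.loads('{"name":"prometheus/snmp-exporter","tags":["v0.15.0"]}')

print(tag_is_served(expected, response))  # True
```

A result of False for a repository that does appear in the catalog suggests that a different version of the image was pushed than the one Konvoy is trying to pull.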
The above examples should help you to identify any issues with your local docker registry that are preventing Konvoy from deploying, or an addon or service from being deployed once the cluster is up and running.