In this article, we describe the most common failures users have encountered when building Cluster API compliant images for the vSphere provider in DKP 2.2.x, and how to remediate them.
Template has no snapshots
If the user forgets to take a snapshot after shutting down the VM, the following error is reported. This happens because "linked_clone" is set to "true" by default, which instructs Packer to create the VM from the template's latest snapshot.
==> vsphere-clone: Cloning VM...
Build 'vsphere-clone' errored after 237 milliseconds 524 microseconds: `linked_clone=true`, but template has no snapshots
To remediate, take a snapshot of the template VM after shutting it down, then run the build again.
Template cannot be found
When the template does not exist, or is not located where the Packer configuration expects it, the following error is returned:
==> vsphere-clone: Cloning VM...
Build 'vsphere-clone' errored after 527 milliseconds 174 microseconds: Error waiting for vm Clone to complete: A specified parameter was not correct: spec.pool
To fix this, review how resources are organized in vCenter and make sure the host where the template resides is part of the cluster specified in the Packer configuration:
packer:
  ...
  cluster: "Cluster-A"
  datacenter: "DC-CAPIV"
  datastore: "datastore-capiv"
  folder: "konvoy-capi-vsphere"
  insecure_connection: "true"
  network: "CAPI-V-Network"
  resource_pool: "ResourcePool-CAPIV"
  template: "base-os-rhel84"
  ...
Network configuration
If the VM or vSphere network environment configuration does not allow internet access, package repositories cannot be configured and the following exception will be reported:
OSError: Curl error (7): Couldn't connect to server for https://repo.almalinux.org/almalinux/RPM-GPG-KEY-AlmaLinux [Failed to connect to repo.almalinux.org port 443: No route to host]\nConnection to 127.0.0.1 closed.\r\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
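A quick way to confirm this diagnosis is to test, from a machine on the same network as the VM, whether the repository host named in the error accepts TCP connections on port 443. The following is a minimal sketch rather than part of KIB; the function name `can_reach` and its default timeout are assumptions chosen for illustration.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, "No route to host", refused connections and timeouts.
        return False
```

Calling `can_reach("repo.almalinux.org")` from the affected network should return False while the routing problem shown above persists. Substitute the repository hosts that your base OS actually uses.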
vCenter appliance configured with a self-signed SSL certificate
If the vCenter appliance uses a self-signed SSL certificate, Packer fails when the insecure_connection parameter is set to "false" in images/ova/os-version.yaml. The following error is reported:
./konvoy-image build images/ova/rhel-84.yaml --overrides overrides_docker.yaml
...
Build 'vsphere-clone' errored after 33 milliseconds 931 microseconds: Post "https://10.0.0.9/sdk": x509: cannot validate certificate for 10.0.0.9 because it doesn't contain any IP SANs
==> Wait completed after 34 milliseconds 44 microseconds
==> Some builds didn't complete successfully and had errors:
--> vsphere-clone: Post "https://10.0.0.9/sdk": x509: cannot validate certificate for 10.0.0.9 because it doesn't contain any IP SANs
...
To resolve this, set the parameter to "true":
insecure_connection: "true"
Red Hat registration
When building a compliant RHEL image, if the system is not registered with Red Hat and therefore has no access to the Red Hat repositories, package installation fails:
vsphere-clone: fatal: [default]: FAILED! => {"attempts": 5, "changed": false, "failures": ["No package yum-utils available.", "No package yum-plugin-versionlock available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}
To remediate, register the VM with the following command:
subscription-manager register --username <username> --password <password> --auto-attach
SSH access issues
In KIB, the username and password both default to “builder”. If a user with that name and password is not configured in the base OS image, template creation fails because SSH access is not possible:
==> vsphere-clone: Waiting for SSH to become available...
==> vsphere-clone: Error waiting for SSH: Packer experienced an authentication error when trying to connect via SSH. This can happen if your username/password are wrong. You may want to double-check your credentials as part of your debugging process. original error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none password], no supported methods remain
To solve this issue, either make sure that a user “builder” with password “builder” and admin privileges is configured in the base OS image, or override the KIB username and password values to match the user configured in the base OS image. For details on how to override the username and password in KIB, please refer to this article.
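Taken together, the error messages quoted throughout this article can serve as signatures for a first-pass triage of a saved build log. The sketch below is illustrative only: the `KNOWN_FAILURES` table and the `diagnose` function are hypothetical names, the signatures are substrings copied from the messages above, and a match is a hint about which section to read, not a definitive diagnosis.

```python
from typing import List

# (signature, hint) pairs. Signatures are substrings of the errors quoted in
# this article; the hints point back to the matching section. Illustrative only.
KNOWN_FAILURES = [
    ("template has no snapshots",
     "Template has no snapshots: take a snapshot of the template VM."),
    ("A specified parameter was not correct: spec.pool",
     "Template cannot be found: check that the template's host is in the configured cluster."),
    ("Couldn't connect to server",
     "Network configuration: the VM cannot reach the package repositories."),
    ("doesn't contain any IP SANs",
     'Self-signed certificate: set insecure_connection to "true".'),
    ("No package yum-utils available",
     "Red Hat registration: register the VM with subscription-manager."),
    ("ssh: unable to authenticate",
     "SSH access issues: check the user and password in the base OS image."),
]


def diagnose(log_text: str) -> List[str]:
    """Return the hints whose signature occurs in log_text, in table order."""
    return [hint for signature, hint in KNOWN_FAILURES if signature in log_text]
```

To use it, capture the build output to a file, read the file into a string, and pass that string to `diagnose`. An empty result means that none of the failures covered in this article were recognized.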