When DKP deploys Kubernetes clusters into a vSphere environment that uses self-signed certificates, the vCenter TLS thumbprint must be trusted. Otherwise, the cluster-api vSphere provider cannot communicate with the vCenter API.
If the vCenter appliance is patched or upgraded while a DKP cluster is running, and the TLS thumbprint changes as a result, some controllers can no longer communicate with the vCenter API. Operations such as creating or deleting persistent volumes then fail.
Here is an example of the event logged by the vsphere-csi-controller when it is using an outdated TLS thumbprint:
{"level":"error","time":"2023-01-04T18:46:10.304026066Z","caller":"service/driver.go:157","msg":"failed to run the driver. Err: +Post \"https://10.0.0.9:443/sdk\": host \"10.0.0.9:443\" thumbprint does not match \"69:3B:BB:FD:BC:F0:83:A3:9D:2D:49:3A:B1:08:07:E8:7E:AC:C8:03\"","TraceId":"eb60d75f-7cd4-44d4-8968-d1587dc280fe","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service.(*vsphereCSIDriver).Run\n\t/build/pkg/csi/service/driver.go:157\nmain.main\n\t/build/cmd/vsphere-csi/main.go:89\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:225"}
Customers have asked which DKP objects must be updated to avoid these issues. The objects where the TLS thumbprint must be updated to avoid disrupting the cluster lifecycle are described below:
The first object where the TLS thumbprint is referred to is the vspherecluster. The thumbprint can be updated with the following command:
kubectl patch vspherecluster <CLUSTER_NAME> --type=merge -p '{"spec": {"thumbprint": "<TLS_THUMBPRINT>"}}'
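The <TLS_THUMBPRINT> value used above has the same form as the one in the log message: 20 colon-separated hex bytes, which matches a SHA-1 certificate fingerprint. One possible way to obtain the new value is sketched below. It assumes openssl is installed and that the vCenter API is reachable on port 443, as in the log message. The helper names extract_thumbprint and vcenter_thumbprint are hypothetical and not part of DKP.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical and not part of DKP.

# Read an "... Fingerprint=AA:BB:..." line on stdin and print the bare
# thumbprint, failing if it is not 20 colon-separated hex bytes (SHA-1).
extract_thumbprint() {
  local fp
  fp="$(cut -d= -f2 | tr 'a-f' 'A-F' | tr -d '[:space:]')"
  if printf '%s' "$fp" | grep -Eq '^([0-9A-F]{2}:){19}[0-9A-F]{2}$'; then
    printf '%s\n' "$fp"
  else
    echo "unexpected fingerprint format: ${fp}" >&2
    return 1
  fi
}

# Fetch the certificate served by vCenter and print its SHA-1 thumbprint.
# Assumes openssl is installed and the API is reachable on port 443.
vcenter_thumbprint() {
  local host="$1"
  openssl s_client -connect "${host}:443" </dev/null 2>/dev/null \
    | openssl x509 -noout -fingerprint -sha1 \
    | extract_thumbprint
}

# Example (requires network access to vCenter):
#   vcenter_thumbprint 10.0.0.9
```

After patching, kubectl get vspherecluster <CLUSTER_NAME> -o jsonpath='{.spec.thumbprint}' should print the new value.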
The vspheremachinetemplate objects, both control-plane and worker, also refer to the TLS thumbprint, but they are immutable. Because of this, there is no reason to patch them.
The secret vsphere-config-secret in the vmware-system-csi namespace is mounted as a volume and used by the vSphere CSI driver. To patch the secret, use the command below. Remember that the value of csi-vsphere.conf must be base64 encoded.
kubectl patch secret vsphere-config-secret -n vmware-system-csi --type='json' -p='[{"op" : "replace" ,"path" : "/data/csi-vsphere.conf" ,"value" : "<BASE64 Encoded>"}]'
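The base64 value for the patch above can be produced from the existing secret rather than written by hand. The sketch below decodes the current value, rewrites the thumbprint line, and re-encodes the result on a single line. It assumes that csi-vsphere.conf holds the thumbprint on a line of the form thumbprint = "..."; check the decoded file in your cluster before relying on it. The helper name replace_thumbprint_b64 is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: replace_thumbprint_b64 is a hypothetical helper name.

# stdin:  base64-encoded csi-vsphere.conf
# $1:     new TLS thumbprint
# stdout: base64-encoded config with the thumbprint line rewritten
replace_thumbprint_b64() {
  local new="$1"
  base64 --decode \
    | sed -E "s/^([[:space:]]*thumbprint[[:space:]]*=[[:space:]]*).*/\1\"${new}\"/" \
    | base64 | tr -d '\n'   # strip line wrapping so the value fits in JSON
}

# Example (requires access to the cluster):
#   current="$(kubectl get secret vsphere-config-secret -n vmware-system-csi \
#     -o jsonpath='{.data.csi-vsphere\.conf}')"
#   encoded="$(printf '%s' "$current" | replace_thumbprint_b64 '<TLS_THUMBPRINT>')"
#   kubectl patch secret vsphere-config-secret -n vmware-system-csi --type='json' \
#     -p="[{\"op\":\"replace\",\"path\":\"/data/csi-vsphere.conf\",\"value\":\"${encoded}\"}]"
```

Decoding the output with base64 --decode before patching is a cheap way to confirm that only the thumbprint line changed.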
Lastly, the vsphere-cloud-config configmap in the kube-system namespace must be updated as well. The information in this configmap is used by vsphere-cloud-controller-manager. To update the TLS thumbprint please patch the configmap with the following command:
kubectl --kubeconfig <CLUSTER_NAME>-workload.conf patch cm vsphere-cloud-config -n kube-system --type=merge -p '{"data": {"vsphere.conf": "global:\n  secretName: cloud-provider-vsphere-credentials\n  secretNamespace: kube-system\n  thumbprint: <TLS_THUMBPRINT>\nvcenter:\n  <vCenter_Address>:\n    datacenters:\n    - \"dc1\"\n    secretName: cloud-provider-vsphere-credentials\n    secretNamespace: kube-system\n    server: \"<vCenter_Address>\"\n    thumbprint: <TLS_THUMBPRINT>\n"}}'
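Escaping a multi-line YAML document inside a JSON string is easy to get wrong. As an alternative sketch, the vsphere.conf content can be rendered to a file and reviewed before it is loaded into the configmap. The helper name render_vsphere_conf is hypothetical. The layout mirrors the fields in the patch command above, with the indentation that nested YAML needs; compare it with the existing configmap before applying anything.

```shell
#!/usr/bin/env bash
# Sketch only: render_vsphere_conf is a hypothetical helper name.

# $1: vCenter address, $2: TLS thumbprint, $3: datacenter name
render_vsphere_conf() {
  local vcenter="$1" thumbprint="$2" datacenter="$3"
  cat <<EOF
global:
  secretName: cloud-provider-vsphere-credentials
  secretNamespace: kube-system
  thumbprint: ${thumbprint}
vcenter:
  ${vcenter}:
    datacenters:
    - "${datacenter}"
    secretName: cloud-provider-vsphere-credentials
    secretNamespace: kube-system
    server: "${vcenter}"
    thumbprint: ${thumbprint}
EOF
}

# Example (requires access to the cluster; review vsphere.conf first):
#   render_vsphere_conf '<vCenter_Address>' '<TLS_THUMBPRINT>' 'dc1' > vsphere.conf
#   kubectl --kubeconfig <CLUSTER_NAME>-workload.conf create configmap vsphere-cloud-config \
#     -n kube-system --from-file=vsphere.conf --dry-run=client -o yaml \
#     | kubectl --kubeconfig <CLUSTER_NAME>-workload.conf apply -f -
```

The thumbprint appears twice, once under global and once under the vCenter entry, so both must carry the new value.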