When deploying Kubernetes clusters with DKP, users may find that the deployment is stuck and resources are not being created, with VpcReconciliationFailed reported as the reason. This happens when the capa-controller does not have AWS credentials configured.
./dkp describe cluster --cluster-name aws-dkp240
NAME                                                        READY  SEVERITY  REASON                           SINCE  MESSAGE
Cluster/aws-dkp240                                          False  Warning   ScalingUp                        48s    Scaling up control plane to 3 replicas (actual 0)
├─ClusterInfrastructure - AWSCluster/aws-dkp240             False  Warning   VpcReconciliationFailed          42s    0 of 7 completed
├─ControlPlane - KubeadmControlPlane/dkp240-control-plane   False  Warning   ScalingUp                        48s    Scaling up control plane to 3 replicas (actual 0)
└─Workers
  └─MachineDeployment/aws-dkp240-md-0                       False  Warning   WaitingForAvailableMachines      48s    Minimum availability requires 4 replicas, current 0 available
    ├─Machine/aws-dkp240-md-0-7987b5c774-8ddtk              False  Info      WaitingForClusterInfrastructure  47s    0 of 2 completed
    ├─Machine/aws-dkp240-md-0-7987b5c774-gtnqk              False  Info      WaitingForClusterInfrastructure  47s    0 of 2 completed
    ├─Machine/aws-dkp240-md-0-7987b5c774-pkbvc              False  Info      WaitingForClusterInfrastructure  47s    0 of 2 completed
    └─Machine/aws-dkp240-md-0-7987b5c774-xjbpx              False  Info      WaitingForClusterInfrastructure  46s    0 of 2 completed
To confirm that missing credentials are the cause, check the capa-controller logs, which report the lack of credentials as shown below:
kubectl logs -n capa-system -l control-plane=capa-controller-manager -c manager -f
E0722 00:10:09.691820 1 awscluster_controller.go:281] controller/awscluster "msg"="failed to reconcile network" "error"="failed to create new vpc: failed to create vpc: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors" "cluster"="aws-dkp240" "name"="aws-dkp240" "namespace"="default" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="AWSCluster"
E0722 00:10:09.694757 1 controller.go:317] controller/awscluster "msg"="Reconciler error" "error"="failed to create new vpc: failed to create vpc: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors" "name"="aws-dkp240" "namespace"="default" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="AWSCluster"
Alternatively, check the status of the AWSCluster resource with the command:
kubectl get AWSCluster aws-dkp240 -ojsonpath='{.status}'
{"conditions":[{"lastTransitionTime":"2023-07-24T22:32:35Z","message":"0 of 7 completed","reason":"VpcReconciliationFailed","severity":"Warning","status":"False","type":"Ready"},{"lastTransitionTime":"2023-07-24T22:32:29Z","status":"True","type":"PrincipalCredentialRetrieved"},{"lastTransitionTime":"2023-07-24T22:32:29Z","status":"True","type":"PrincipalUsageAllowed"},{"lastTransitionTime":"2023-07-24T22:32:35Z","message":"failed to create new vpc: failed to create vpc: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors","reason":"VpcReconciliationFailed","severity":"Warning","status":"False","type":"VpcReady"}],"ready":false}
This can happen when the bootstrap cluster is created without specifying --with-aws-bootstrap-credentials=true, so the credentials are not included in the capa-manager-bootstrap-credentials secret.
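For future deployments, the flag can be passed at bootstrap creation time. As a sketch, assuming the standard dkp create bootstrap subcommand for this DKP version:

# assumes the dkp create bootstrap subcommand; adjust to your DKP version
./dkp create bootstrap --with-aws-bootstrap-credentials=true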
To resolve the issue on the existing bootstrap cluster, update the secret holding the credentials by running:
./dkp update bootstrap credentials aws
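Note that this command picks up AWS credentials from the environment of the machine running dkp. The exact mechanism may vary by DKP version, but as a sketch, exporting the standard AWS SDK variables beforehand would look like:

# standard AWS SDK credential variables; values below are placeholders
export AWS_ACCESS_KEY_ID=<access-key-id>
export AWS_SECRET_ACCESS_KEY=<secret-access-key>
export AWS_SESSION_TOKEN=<session-token>   # only needed for temporary credentials
./dkp update bootstrap credentials aws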
To confirm that the secret now contains the AWS credentials, run:
kubectl get secret capa-manager-bootstrap-credentials -n capa-system -o jsonpath='{.data.credentials}' | base64 --decode
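The decoded output is expected to be an AWS shared-credentials-style profile (this format is an assumption based on how Cluster API Provider AWS typically consumes the secret), for example:

# illustrative placeholder values
[default]
aws_access_key_id = <access-key-id>
aws_secret_access_key = <secret-access-key>

Once the credentials are present, the capa-controller should reconcile the VPC on its next retry. Re-running the earlier command confirms that the cluster is progressing:

./dkp describe cluster --cluster-name aws-dkp240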