Overview
In some GovCloud and similar AWS environments, you may not have access to AWS Secrets Manager. When that is the case, deploying a cluster to AWS fails with errors similar to "msg"="failed to create AWS Secret entry".
Solution
If Secrets Manager is unavailable in your environment, you can use AWS Systems Manager Parameter Store instead to get your cluster up and running. A few steps are required before deploying the cluster. First, generate the YAML for your cluster:
dkp create cluster aws -c <cluster name> --dry-run -o yaml > cluster.yaml
You need this file because the default secrets backend must be changed to point to Parameter Store rather than Secrets Manager. To do this, modify the AWSMachineTemplate objects in cluster.yaml (the control-plane template and the worker template) so that both contain the cloudInit setting shown in the example below:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSMachineTemplate
metadata:
  name: adoll-1005-control-plane
  namespace: default
spec:
  template:
    spec:
      cloudInit:
        secureSecretsBackend: ssm-parameter-store
      iamInstanceProfile: control-plane.cluster-api-provider-aws.sigs.k8s.io
      imageLookupBaseOS: ubuntu-20.04
      imageLookupFormat: capa-ami-{{.BaseOS}}-?{{.K8sVersion}}-*
      imageLookupOrg: "258751437250"
      instanceType: m5.xlarge
      rootVolume:
        size: 80
      sshKeyName: ""
After setting secureSecretsBackend to ssm-parameter-store, modify the IAM policy behind your iamInstanceProfile so that it contains the permissions required to deploy a cluster. If you are using the D2iQ defaults, modify the nodes.cluster-api-provider-aws.sigs.k8s.io policy to include the statement below:
...
{
  "Effect": "Allow",
  "Action": [
    "ssm:DeleteParameter",
    "ssm:GetParameter"
  ],
  "Resource": [
    "arn:*:ssm:*:*:parameter/cluster.x-k8s.io/*"
  ]
}
...
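If you keep the policy document as a JSON file, the statement above can be merged in with a short script instead of editing by hand. This is a minimal sketch that works on a policy already loaded into a Python dict (for example with json.load). The names NODE_SSM_STATEMENT and with_statement are hypothetical, and the sketch does not call AWS. Uploading the result is left to whatever tooling you already use.

```python
import copy
import json

# Statement from this article: lets nodes read and delete their bootstrap data.
NODE_SSM_STATEMENT = {
    "Effect": "Allow",
    "Action": ["ssm:DeleteParameter", "ssm:GetParameter"],
    "Resource": ["arn:*:ssm:*:*:parameter/cluster.x-k8s.io/*"],
}


def with_statement(policy, statement):
    """Return a copy of policy with statement appended, unless it is already present."""
    updated = copy.deepcopy(policy)
    statements = updated.get("Statement", [])
    # IAM allows "Statement" to be a single object; normalize it to a list.
    if isinstance(statements, dict):
        statements = [statements]
    if statement not in statements:
        statements.append(copy.deepcopy(statement))
    updated["Statement"] = statements
    return updated


def to_json(policy):
    """Render the policy as indented JSON, ready to save to a file."""
    return json.dumps(policy, indent=2)
```

Because with_statement skips a statement that is already present, running it twice on the same policy gives the same result as running it once.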
If you have specified your own IAM roles, add the ssm:DeleteParameter and ssm:GetParameter permissions to the roles used by both your control-plane nodes and your worker nodes. If you are deploying from an EC2 instance, the IAM role of that instance also needs permissions, separate from the ones above. To use the minimum permissions needed to deploy a cluster, use the CloudFormation template in our documentation. After deploying the template, you still need to add the permissions below:
{
  "Effect": "Allow",
  "Action": [
    "ssm:PutParameter",
    "ssm:GetParameter",
    "ssm:AddTagsToResource",
    "ssm:DeleteParameter"
  ],
  "Resource": [
    "arn:*:ssm:*:*:parameter/cluster.x-k8s.io/*"
  ]
}
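To catch a missing permission before the CAPA controller reports it, you can compare a policy document against the four actions listed above. The sketch below is a simplified check under stated assumptions: it only matches action and resource strings literally, so it does not evaluate IAM wildcards, conditions, or Deny statements. The names missing_deployer_actions, REQUIRED_DEPLOYER_ACTIONS, and PARAMETER_RESOURCE are hypothetical.

```python
# The four actions this article lists for the role that deploys the cluster.
REQUIRED_DEPLOYER_ACTIONS = {
    "ssm:PutParameter",
    "ssm:GetParameter",
    "ssm:AddTagsToResource",
    "ssm:DeleteParameter",
}
PARAMETER_RESOURCE = "arn:*:ssm:*:*:parameter/cluster.x-k8s.io/*"


def _as_list(value):
    """IAM accepts either a single value or a list; always return a list."""
    return value if isinstance(value, list) else [value]


def missing_deployer_actions(policy):
    """Return the required actions not granted on PARAMETER_RESOURCE, sorted.

    Literal string comparison only: wildcards such as "ssm:*" are not expanded.
    """
    granted = set()
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        if PARAMETER_RESOURCE not in _as_list(statement.get("Resource", [])):
            continue
        granted.update(_as_list(statement.get("Action", [])))
    return sorted(REQUIRED_DEPLOYER_ACTIONS - granted)
```

An empty list means all four actions appear literally in the policy. A non-empty list names the actions still to be added.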
After specifying all roles and granting the proper permissions, create the cluster by running the following command against your bootstrap cluster:
kubectl create -f cluster.yaml
While the cluster is being created, you can watch for errors or permission issues by following the CAPA controller pod logs:
kubectl logs -n capa-system deployment/capa-controller-manager -f
Considerations
If you applied cluster.yaml to your bootstrap cluster and saw missing-permission errors in the capa-controller logs, you will see errors similar to the one below after the permissions take effect:
E0830 17:11:18.095936 1 awsmachine_controller.go:692] "msg"="Failed to create AWS Secret entry" "error"="ParameterAlreadyExists: The parameter already exists. To overwrite this value, set the overwrite option in the request to true." "secretPrefix"="/cluster.x-k8s.io/9fceb96a-d9aa-4c60-9688-3caceee39a45"
If these errors are present, the simplest way to resolve them is to delete the cluster from the bootstrap cluster with 'dkp delete cluster -c <cluster name> --with-kubernetes-resources=false', delete the bootstrap cluster, and then reapply the YAML. From there, new secrets are created and the errors are resolved.
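If you have saved the controller logs to a file, a short script can confirm whether this condition is present and which parameter prefixes are affected. This is a minimal sketch that assumes the log lines follow the format of the example above, with a quoted "secretPrefix" field. The function name conflicting_secret_prefixes is hypothetical.

```python
import re


def conflicting_secret_prefixes(log_text):
    """Return the unique secretPrefix values from ParameterAlreadyExists log lines."""
    prefixes = []
    for line in log_text.splitlines():
        if "ParameterAlreadyExists" not in line:
            continue
        # Assumes the field is logged as "secretPrefix"="<value>".
        match = re.search(r'"secretPrefix"="([^"]+)"', line)
        if match and match.group(1) not in prefixes:
            prefixes.append(match.group(1))
    return prefixes
```

A non-empty result indicates that parameters left over from the earlier attempt are blocking the new one, which is the case the delete-and-reapply steps above address.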