Problem
When deploying Konvoy to AWS from a bootstrap host whose only internet access is through a proxy, infrastructure creation may fail. In that case, you will see errors similar to the following in the capa-controller-manager pod:
E0201 07:37:15.041796 1 controller.go:304] controller/awscluster "msg"="Reconciler error" "error"="failed to create new vpc: failed to create vpc: RequestError: send request failed\ncaused by: Post \"https://ec2.eu-west-1.amazonaws.com/\": dial tcp 54.239.39.130:443: i/o timeout" "name"="<cluster name>" "namespace"="default" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="AWSCluster"
This is due to a limitation of Cluster API (CAPI): currently, the CAPI controller pods do not pick up the bootstrap host's proxy variables, so the AWS endpoints are unreachable and the deployment fails.
Solution
If the proxy is your only route to the internet, you will need to manually edit each controller's deployment to include your http_proxy, https_proxy, and no_proxy environment variables. The following deployments need to be edited:
NAMESPACE                           NAME
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager
capi-system                         capi-controller-manager
capa-system                         capa-controller-manager
If you are experiencing a similar issue with the Azure or pre-provisioned infrastructure providers, add the proxy variables to these deployments as well:
NAMESPACE       NAME
cappp-system    cappp-controller-manager
capz-system     capz-controller-manager
You will need to add YAML containing your proxy information to the env section of each container in the deployment spec. Most deployments will not have an env section; in that case, add it just below the container's args, as is done below:
- args:
  - --metrics-bind-addr=127.0.0.1:8080
  - --leader-elect
  - --feature-gates=EKS=true,EKSEnableIAM=false,EKSAllowAddRoles=false,EKSFargate=false,MachinePool=true,EventBridgeInstanceState=false,AutoControllerIdentityCreator=true
  - --service-endpoints=
  env:
  - name: AWS_SHARED_CREDENTIALS_FILE
    value: /home/.aws/credentials
  - name: http_proxy
    value: <IP>:<port>
  - name: https_proxy
    value: <IP>:<port>
  - name: no_proxy
    value: localhost,127.0.0.1,10.96.0.0/12,192.168.0.0/16,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,.svc,.svc.cluster,.svc.cluster.local,169.254.169.254
Please note that the default no_proxy suggestions include .elb.amazonaws.com. In this case we want to remove that entry so that calls to the ELB flow through the proxy.
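Rather than hand-editing each deployment, the same environment variables can be injected with `kubectl set env`, which triggers the same rollout. The sketch below is a convenience script, not part of Konvoy's tooling: the proxy address is the same `<IP>:<port>` placeholder as above, and the `DRY_RUN` guard (an assumption of this sketch) prints the commands instead of running them until you set it to 0.

```shell
#!/bin/sh
# Sketch: inject proxy variables into each CAPI/CAPA controller deployment.
# PROXY_URL is a placeholder -- substitute your proxy's address and port.
PROXY_URL="<IP>:<port>"
NO_PROXY="localhost,127.0.0.1,10.96.0.0/12,192.168.0.0/16,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,.svc,.svc.cluster,.svc.cluster.local,169.254.169.254"

# DRY_RUN=1 (the default) only prints the kubectl commands; set DRY_RUN=0 to apply.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi; }

patched=0
while read -r ns deploy; do
  run kubectl set env deployment/"$deploy" -n "$ns" \
    http_proxy="$PROXY_URL" https_proxy="$PROXY_URL" no_proxy="$NO_PROXY"
  patched=$((patched + 1))
done <<EOF
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager
capi-system capi-controller-manager
capa-system capa-controller-manager
EOF
echo "processed $patched deployments"
```

If you use the Azure or pre-provisioned providers, append the cappp-system and capz-system deployments to the list. Note that `kubectl set env` does not add the AWS_SHARED_CREDENTIALS_FILE entry shown above; that variable is normally present already.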
After each deployment is edited, its pod is automatically restarted and will contain the new environment variables. You can validate this by describing the pods for each deployment:
kubectl describe pod -n capa-system
Name:           capa-controller-manager-74dbb67b69-n6cqq
Namespace:      capa-system
Priority:       0
Node:           konvoy-capi-bootstrapper-control-plane
Start Time:     Wed, 23 Feb 2022 17:28:41 +0000
Labels:         cluster.x-k8s.io/provider=infrastructure-aws
Annotations:    kubectl.kubernetes.io/restartedAt: 2022-02-23T17:28:41Z
Status:         Running
Controlled By:  ReplicaSet/capa-controller-manager-74dbb67b69
Containers:
  manager:
    Ports:       9443/TCP, 9440/TCP
    Host Ports:  0/TCP, 0/TCP
    Args:
      --metrics-bind-addr=127.0.0.1:8080
      --leader-elect
      --feature-gates=EKS=true,EKSEnableIAM=false,EKSAllowAddRoles=false,EKSFargate=false,MachinePool=true,EventBridgeInstanceState=false,AutoControllerIdentityCreator=true
      --service-endpoints=
    Environment:
      AWS_SHARED_CREDENTIALS_FILE:  /home/.aws/credentials
      http_proxy:                   <IP>:<port>
      https_proxy:                  <IP>:<port>
      no_proxy:                     localhost,127.0.0.1,10.96.0.0/12,192.168.0.0/16,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,.svc,.svc.cluster,.svc.cluster.local,169.254.169.254
  kube-rbac-proxy:
    Image:  gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0
    Args:
      --secure-listen-address=0.0.0.0:8443
      --upstream=http://127.0.0.1:8080/
      --logtostderr=true
      --v=10
    Environment:
      http_proxy:   <IP>:<port>
      https_proxy:  <IP>:<port>
      no_proxy:     localhost,127.0.0.1,10.96.0.0/12,192.168.0.0/16,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,.svc,.svc.cluster,.svc.cluster.local,169.254.169.254
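Reading the full describe output for every deployment is tedious. As a shorter check, the sketch below (an assumption of this article, using kubectl's standard jsonpath output formatting) lists only the environment variable names per container for each of the four deployments:

```shell
#!/bin/sh
# List env var names per container for each edited deployment.
JSONPATH='{range .spec.template.spec.containers[*]}{.name}: {.env[*].name}{"\n"}{end}'
for target in \
  capi-kubeadm-bootstrap-system/capi-kubeadm-bootstrap-controller-manager \
  capi-kubeadm-control-plane-system/capi-kubeadm-control-plane-controller-manager \
  capi-system/capi-controller-manager \
  capa-system/capa-controller-manager; do
  ns=${target%/*}      # text before the slash is the namespace
  deploy=${target#*/}  # text after the slash is the deployment name
  echo "== $ns/$deploy"
  # Guard so the loop degrades gracefully where kubectl is unavailable.
  command -v kubectl >/dev/null 2>&1 &&
    kubectl get deployment "$deploy" -n "$ns" -o jsonpath="$JSONPATH" ||
    echo "   (kubectl unavailable; run this on the bootstrap host)"
done
```

Each container line should list http_proxy, https_proxy, and no_proxy; a missing name means that deployment was not updated.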
Once all deployments have the correct proxy variables, the controller pods will reconcile and finish deploying the cluster.
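To watch reconciliation resume, you can tail the manager container's logs with a standard kubectl command; the "dial tcp ... i/o timeout" errors shown earlier should stop appearing. The guard below is only so the snippet degrades gracefully off the bootstrap host:

```shell
#!/bin/sh
# Tail recent logs from the CAPA controller's manager container.
NS=capa-system
DEPLOY=capa-controller-manager
command -v kubectl >/dev/null 2>&1 &&
  kubectl logs -n "$NS" deployment/"$DEPLOY" -c manager --tail=20 ||
  echo "kubectl unavailable; run this on the bootstrap host"
```

Add `-f` to follow the log stream while the cluster finishes deploying.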