When you deploy a LoadBalancer-type Service in an AWS-based deployment, every availability zone in use within the VPC must have a subnet with at least eight free IP addresses. If one does not, the controller-manager pod logs errors similar to this:
W0104 15:05:43.297154 1 aws.go:3497] Found multiple subnets in AZ "us-east-1a"; choosing "subnet-123" between subnets "subnet-abc" and "subnet-def"
W0104 15:05:43.297162 1 aws.go:3497] Found multiple subnets in AZ "us-east-1b"; choosing "subnet-234" between subnets "subnet-gji" and "subnet-jkl"
W0104 15:05:43.297169 1 aws.go:3497] Found multiple subnets in AZ "us-east-1c"; choosing "subnet-567" between subnets "subnet-mno" and "subnet-pqr"
W0104 15:05:43.297177 1 aws.go:3497] Found multiple subnets in AZ "us-east-1d"; choosing "subnet-678" between subnets "subnet-stu" and "subnet-vwx"
W0104 15:05:43.297185 1 aws.go:3497] Found multiple subnets in AZ "us-east-1e"; choosing "subnet-789" between subnets "subnet-yyz" and "subnet-zzy"
I0104 15:05:44.724528 1 aws_loadbalancer.go:1009] Creating load balancer for <LoadBalancer service> with name: <name>
E0104 15:05:45.638642 1 controller.go:275] error processing service <LoadBalancer service> (will retry): failed to ensure load balancer: InvalidSubnet: Not enough IP space available in subnet-789. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: <>
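The error names the subnet that ran out of addresses. To see the free address count for every subnet in the VPC, a sketch along the following lines may help. It assumes the AWS CLI is installed and configured for the account that owns the VPC; the function names and the VPC ID in the usage comment are placeholders introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the VPC ID below are hypothetical placeholders.

# Print one line per subnet in the given VPC: AZ, subnet ID, free IP count.
list_subnet_free_ips() {
  local vpc_id="$1"
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=${vpc_id}" \
    --query 'Subnets[].[AvailabilityZone,SubnetId,AvailableIpAddressCount]' \
    --output text
}

# Keep only the rows whose third column (free addresses) meets the minimum.
# The minimum defaults to 8, the number the ELB error message asks for.
filter_usable_subnets() {
  awk -v min="${1:-8}" '($3 + 0) >= (min + 0)'
}

# Usage (not run automatically):
#   list_subnet_free_ips vpc-0123456789abcdef0 | sort
#   list_subnet_free_ips vpc-0123456789abcdef0 | filter_usable_subnets | sort
```

Subnets that appear in the first listing but not in the second have fewer than eight free addresses.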
In the example log, fewer than eight IP addresses are free in the subnet chosen for us-east-1e (subnet-789), so the Service fails to deploy. If possible, make sure at least eight free IP addresses are available in that subnet; this resolves the issue. If that is not possible, you can tell the AWS controller which subnets to deploy to by adding the following annotation to the LoadBalancer Service:
"service.beta.kubernetes.io/aws-load-balancer-subnets": "subnet-123, subnet-345, subnet-678"
Alternatively, you can specify security groups that are applied to a subset of subnets. This can be done with the following annotation:
"service.beta.kubernetes.io/aws-load-balancer-security-groups": "sg-1, sg2"
Adding these annotations directly to the Service only allows the load balancer to be created temporarily. DKP reconciles the state of the cluster, so any on-the-fly changes to components will most likely be reverted automatically. To make the change persistent, create an overrides ConfigMap for Traefik that contains the annotations, then reference that ConfigMap from the relevant AppDeployment object. To do so, follow the steps below:
1. Create an overrides ConfigMap with your desired annotation and save it as overrides.yaml:
apiVersion: v1
data:
  values.yaml: |
    ---
    service:
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-security-groups: "sg-1, sg2"
kind: ConfigMap
metadata:
  name: traefik-overrides
  namespace: kommander
2. Apply it to the cluster:
kubectl create -f overrides.yaml
3. Patch your ConfigMap into your AppDeployment object. Make sure the ConfigMap is in the proper namespace (kommander in this example) and that the name under configOverrides matches the name of the ConfigMap:
kubectl edit appdeployment -n kommander traefik
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  creationTimestamp: "2023-04-10T15:33:09Z"
  finalizers:
  - kommander.mesosphere.io/appdeployment
  - kommander.mesosphere.io/appsconfigfederation
  generation: 2
  name: traefik
  namespace: kommander
  resourceVersion: "28193"
  uid: e31a5c9a-28d8-4a9d-bc12-6064db2607e8
spec:
  appRef:
    kind: ClusterApp
    name: traefik-10.30.1
  configOverrides:
    name: traefik-overrides
status:
  clusters:
  - conditions:
    - status: "True"
      type: AppDeploymentEnabled
    name: host-cluster
  observedGeneration: 2
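If you prefer not to open an editor, steps 2 and 3 can also be checked and applied non-interactively. The following is a minimal sketch, assuming kubectl is pointed at the management cluster and that the resource names match the examples above (kommander, traefik, traefik-overrides). The function names are hypothetical helpers introduced here and are not part of DKP.

```shell
#!/usr/bin/env bash
# Sketch only: names are taken from the examples above; adjust them if yours differ.
NAMESPACE="kommander"
APP="traefik"
OVERRIDES="traefik-overrides"

# Show the values.yaml stored in the overrides ConfigMap (verifies step 2).
show_overrides() {
  kubectl get configmap "$OVERRIDES" -n "$NAMESPACE" \
    -o jsonpath='{.data.values\.yaml}'
}

# Set spec.configOverrides.name on the AppDeployment with a merge patch
# instead of editing the object by hand (alternative to step 3).
patch_appdeployment() {
  kubectl patch appdeployment "$APP" -n "$NAMESPACE" --type merge \
    -p "{\"spec\":{\"configOverrides\":{\"name\":\"${OVERRIDES}\"}}}"
}

# Print the ConfigMap name the AppDeployment currently references.
show_config_overrides() {
  kubectl get appdeployment "$APP" -n "$NAMESPACE" \
    -o jsonpath='{.spec.configOverrides.name}'
}

# Usage (not run automatically):
#   show_overrides
#   patch_appdeployment
#   show_config_overrides
```

A merge patch only touches the field it names, so the rest of the AppDeployment spec is left as it was.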
After some time, you should see your HelmRelease enter a reconciling state; if not, you can follow the steps in this Knowledge Article to force reconciliation. If, after some time, the HelmRelease has still not reconciled and your overrides ConfigMap does not appear in the valuesFrom section of its .spec, you can delete the HelmRelease, and it will be recreated after a few moments. Please note: deleting the HelmRelease creates a new load balancer Service for Traefik with a new DNS entry associated with it.
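To check whether the override has been picked up, you can inspect the HelmRelease and the resulting Service. This is a sketch under assumptions: the HelmRelease and Service names are not given in this article, so they are passed in as arguments, and the function names are hypothetical helpers. Running kubectl get helmrelease -n kommander and kubectl get service -n kommander should show the actual names.

```shell
#!/usr/bin/env bash
# Sketch only: release and service names are arguments because this article does not state them.

# Print the valuesFrom entries of a HelmRelease; the overrides ConfigMap
# should be listed there once the AppDeployment change has propagated.
show_values_from() {
  local release="$1" namespace="${2:-kommander}"
  kubectl get helmrelease "$release" -n "$namespace" \
    -o jsonpath='{.spec.valuesFrom}'
}

# Print the annotations on a Service to confirm the override reached it.
show_service_annotations() {
  local service="$1" namespace="${2:-kommander}"
  kubectl get service "$service" -n "$namespace" \
    -o jsonpath='{.metadata.annotations}'
}

# Usage (not run automatically; replace the placeholders with real names):
#   show_values_from <helmrelease-name>
#   show_service_annotations <traefik-service-name>
```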