Introduction
When a nodepool is created, Cluster API creates a MachineDeployment, and the Cluster API provider attempts to fulfill the desired number of replicas by spinning up new nodes that join the cluster. These new nodes are automatically labeled by various systems already in place, for example node-feature-discovery.
However, a user may want to assign a specific label to the nodes, for example for workload node affinity. In such cases it would be helpful if new nodes were labeled dynamically. This matters especially with cluster-autoscaler, where new nodes are created and joined to the cluster and you want your pending workloads to be scheduled on them.
Currently, this dynamic label sync between the MachineDeployment and its nodes is still a feature request being worked on in the upstream Cluster API project.
The existing solution is to label the nodes with the kubelet --node-labels flag, set through kubeletExtraArgs, as demonstrated below.
Disclaimer: This approach has drawbacks, which are discussed in the Google doc linked from https://github.com/kubernetes-sigs/cluster-api/issues/493
Creating the nodepool
The nodepool creation guide (for example, the one for GCP) explains how to create a nodepool on an existing cluster. To be able to modify the nodepool objects before they are created, run the command with the --dry-run and --output=yaml flags and redirect the output to a file.
dkp create nodepool gcp ${NODEPOOL_NAME} \
  --cluster-name=${CLUSTER_NAME} \
  --kubeconfig=${CLUSTER_NAME}.conf \
  --image $IMAGE_NAME \
  --zone us-west1-b \
  --replicas=1 \
  --dry-run \
  --output=yaml > ${NODEPOOL_NAME}.yaml
Edit ${NODEPOOL_NAME}.yaml to insert the desired label. In the KubeadmConfigTemplate, add the node-labels entry under joinConfiguration.nodeRegistration.kubeletExtraArgs.
Example:
joinConfiguration:
  nodeRegistration:
    kubeletExtraArgs:
      cloud-provider: gce
      node-labels: "cluster.environment=staging"
Apply this to the cluster:
kubectl create -f ${NODEPOOL_NAME}.yaml --kubeconfig ${CLUSTER_NAME}.conf
Describe a new node to confirm that the labels exist.
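As an alternative to describing each node, the check can be sketched with a label selector. The helper below is a hypothetical convenience function, not part of dkp or kubectl. It assumes that CLUSTER_NAME is set as in the commands above and that the label is the cluster.environment=staging value from the example.

```shell
# Hypothetical helper (not part of dkp or kubectl).
# Lists only the nodes that carry the given label, along with all of their labels.
# Assumes CLUSTER_NAME is set, as in the earlier commands.
# The default selector is the example label used in this article.
show_labeled_nodes() {
  kubectl get nodes \
    --selector "${1:-cluster.environment=staging}" \
    --show-labels \
    --kubeconfig "${CLUSTER_NAME}.conf"
}
```

Running show_labeled_nodes with no argument checks for the example label, and passing a different key=value pair checks for that label instead. If no nodes are listed, the label was most likely not applied.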
With the desired labels in place, the workload can be scheduled on the intended nodes with nodeSelector or nodeAffinity (node anti-affinity is expressed through nodeAffinity with the NotIn or DoesNotExist operators), or with taints and tolerations.
Example: