Overview/Background
If we examine the default configuration for a cluster.yaml, we'll see the following configuration for worker and control plane nodes:
    nodePools:
    - name: worker
      count: 4
      machine:
        rootVolumeSize: 80
        rootVolumeType: gp2
    - name: control-plane
      controlPlane: true
      count: 3
      machine:
        rootVolumeSize: 80
        rootVolumeType: io1
        rootVolumeIOPS: 1000
If you attempt to change rootVolumeType on the worker node pool to io1, you will get an error on konvoy up:
    * aws_instance.worker_pool_worker[0]: 1 error occurred:
        * aws_instance.worker_pool_worker.0: Error launching source instance: InvalidParameterCombination: The parameter iops must be specified for io1 volumes.
        status code: 400, request id: 9dd72a39-4ea5-4120-850e-b8044080f997
If you then also specify rootVolumeIOPS under worker's node pool such that it looks like this:
    nodePools:
    - name: worker
      count: 4
      machine:
        rootVolumeSize: 80
        rootVolumeType: io1
        rootVolumeIOPS: 1000
You will still receive the same error message stating that the parameter iops must be specified for io1 volumes. This is due to a bug in the way Konvoy currently processes worker node pool configurations: the IOPS parameter is dropped before Terraform can use it, which leaves the io1 volume configuration invalid because it lacks the required iops value. To set the worker node root volumes to io1, we must override the Terraform resource used for provisioning, as detailed here:
https://docs.d2iq.com/ksphere/konvoy/1.5/install/install-aws/advanced-provisioning/#adding-custom-terraform-resources-for-provisioning
Solution
You can provide an _override.tf file under extras/provisioner to enable this functionality.
Following the instructions linked above, create the directory structure extras/provisioner in your main Konvoy directory, and then create a file called worker_override.tf with the following contents:
    resource "aws_instance" "worker_pool_worker" {
      root_block_device {
        iops                  = "1000"
        volume_size           = "80"
        volume_type           = "io1"
        delete_on_termination = true
      }
    }
The prefix of the file name does not matter, but it must end in _override.tf. All Konvoy versions prior to 1.6.0 use Terraform 0.11. See here for more information on Terraform 0.11 overrides:
https://www.terraform.io/docs/configuration-0-11/override.html
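Because Terraform 0.11 only treats a file as an override when it is named override.tf or its name ends in _override.tf, a quick name check before provisioning can catch a file that would otherwise be loaded as regular configuration. The shell sketch below is one way to do that. The function name classify_tf_files is hypothetical and is not part of Konvoy or Terraform.

```shell
# Hypothetical helper (not part of Konvoy or Terraform): label each .tf file
# in a directory as an override file or a regular configuration file,
# based on the Terraform 0.11 naming rule for overrides.
classify_tf_files() {
  dir=$1
  for f in "$dir"/*.tf; do
    # Skip the unexpanded pattern when the directory has no .tf files.
    [ -e "$f" ] || continue
    case "${f##*/}" in
      override.tf|*_override.tf) echo "override: $f" ;;
      *)                         echo "regular: $f" ;;
    esac
  done
}

# Usage, from the main Konvoy directory:
#   classify_tf_files extras/provisioner
```

A file reported as "regular" is not necessarily wrong, since extras/provisioner can also hold additional Terraform resources, but a file meant to override a worker pool should appear as "override".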
If you then navigate back to your Konvoy working directory and run ./konvoy provision, it will successfully update your Terraform state file with the desired changes.
You can verify this by running:
    terraform show state/terraform.tfstate

    aws_instance.worker_pool_worker.3:
      ...
      root_block_device.0.delete_on_termination = true
      root_block_device.0.device_name = /dev/sda1
      root_block_device.0.encrypted = false
      root_block_device.0.iops = 1000
      root_block_device.0.kms_key_id =
      root_block_device.0.volume_id = vol-0d3b0ab1825dfba94
      root_block_device.0.volume_size = 80
      root_block_device.0.volume_type = io1
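The full terraform show output is long, so it can help to filter it down to the two attributes that matter here. The following is a minimal sketch that assumes the output uses the attribute format shown above. The function name root_volume_summary is hypothetical.

```shell
# Hypothetical helper: read `terraform show` output on stdin and keep only
# the IOPS and volume-type lines of each instance's root block device.
root_volume_summary() {
  grep -E 'root_block_device\.0\.(iops|volume_type) ='
}

# Usage, from the Konvoy working directory:
#   terraform show state/terraform.tfstate | root_volume_summary
```

Each worker instance should contribute one iops line and one volume_type line. A worker that still shows a volume type other than io1 was not covered by the override.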
If you have multiple worker pools, you need to override each resource separately. You can do this in one file or split over multiple _override.tf files per resource. In cluster.yaml, if the worker pool is named gpu-workers:
    nodePools:
    - name: gpu-workers

Then the resource you would override in your _override.tf file would look like:
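    resource "aws_instance" "worker_pool_<node pool name>" {
      <block to override> {
        <all values for that block>
      }
    }

For the gpu-workers pool above, that is: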
    resource "aws_instance" "worker_pool_gpu-workers" {
      root_block_device {
        iops                  = "1000"
        volume_size           = "80"
        volume_type           = "io1"
        delete_on_termination = true
      }
    }
When overriding the values for an object such as root_block_device, you must include ALL values in your override, not just the values you would like to modify. If you omit items such as delete_on_termination or volume_size, configuration errors will result, because the values from cluster.yaml are either not imported properly or ignored entirely.
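With several worker pools, writing one complete override file per pool by hand is repetitive. The shell sketch below generates them in a loop. It assumes two pools named worker and gpu-workers, as in the examples in this article, and reuses the same root volume values for both. Adjust the pool list and values to match your cluster.yaml.

```shell
# Sketch: write one complete root_block_device override per worker pool.
# Run from the main Konvoy directory. The pool names are assumptions taken
# from the examples in this article.
mkdir -p extras/provisioner

for pool in worker gpu-workers; do
  # Every value is listed, including delete_on_termination and volume_size,
  # because a partial root_block_device override is not merged reliably.
  cat > "extras/provisioner/${pool}_override.tf" <<EOF
resource "aws_instance" "worker_pool_${pool}" {
  root_block_device {
    iops                  = "1000"
    volume_size           = "80"
    volume_type           = "io1"
    delete_on_termination = true
  }
}
EOF
done
```

Using one file per pool keeps each override easy to remove later. Combining all of the resource blocks in a single _override.tf file works as well.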