When upgrading a Kubernetes cluster, Konvoy decides whether the upgrade is safe based on the results of multiple checks performed on both control-plane and worker nodes, as described in our documentation.
Specifically, when upgrading worker nodes, at the stage [Determining Upgrade Safety For Nodes in Pool ...], Konvoy determines whether the worker node upgrade would impact user workloads (through data loss or loss of availability) based on the following conditions (illustrated by the sketch after this list):
1. Are there any pods using a hostPath volume, an emptyDir volume, or a hostPath-backed PersistentVolume?
2. How many pods managed by a ReplicationController or ReplicaSet are running on this node? If all of a workload's replicas are running on the node, Konvoy returns an error, as upgrading the node would take the workload down.
3. Are there any pods running on this node that are not managed by a controller?
4. Are there any pods running on this node that are managed by a controller but have fewer than 2 replicas?
5. Are there any DaemonSet-managed pods running on this node? If so, verify that this is not the only node on which the DaemonSet's pods are running.
6. Are there any pods running on this node that belong to a Job?
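As an illustration, the per-pod portion of these checks can be approximated with client-go. This is a minimal sketch, not Konvoy's actual implementation: the node name worker-0 and the unsafePodReasons helper are hypothetical, and checks 2, 4, and 5 (which need the controller's total replica count) as well as the hostPath PersistentVolume case (which needs a PVC-to-PV lookup) are omitted because they require additional API calls.

    package main

    import (
        "context"
        "fmt"

        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/clientcmd"
    )

    // unsafePodReasons approximates checks 1, 3, and 6 from the list above.
    func unsafePodReasons(pod *corev1.Pod) []string {
        var reasons []string

        // Check 1: hostPath and emptyDir volumes hold node-local data that
        // the drain preceding a node upgrade would destroy.
        for _, v := range pod.Spec.Volumes {
            if v.HostPath != nil || v.EmptyDir != nil {
                reasons = append(reasons, fmt.Sprintf("uses node-local volume %q", v.Name))
            }
        }

        ref := metav1.GetControllerOf(pod)
        switch {
        case ref == nil:
            // Check 3: a bare pod will not be recreated on another node.
            reasons = append(reasons, "not managed by a controller")
        case ref.Kind == "Job":
            // Check 6: evicting a Job's pod interrupts the Job.
            reasons = append(reasons, "belongs to a Job")
        }
        return reasons
    }

    func main() {
        const node = "worker-0" // hypothetical node name

        cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            panic(err)
        }
        client := kubernetes.NewForConfigOrDie(cfg)

        // List every pod scheduled on the node, across all namespaces.
        pods, err := client.CoreV1().Pods("").List(context.TODO(),
            metav1.ListOptions{FieldSelector: "spec.nodeName=" + node})
        if err != nil {
            panic(err)
        }

        for i := range pods.Items {
            pod := &pods.Items[i]
            for _, r := range unsafePodReasons(pod) {
                fmt.Printf("pod %s/%s: %s\n", pod.Namespace, pod.Name, r)
            }
        }
    }

Running a scan like this before the upgrade makes it easier to see in advance which pods would trip the safety checks on a given node.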
If any of these conditions are met, the upgrade is deemed unsafe and is skipped for that worker node. However, in certain scenarios the safety checks can be ignored, and the operator can do so by passing the --force-upgrade flag so that the upgrade is performed on the worker nodes anyway.
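In practice, the operator would first confirm why the checks failed (for example, by listing the offending pods as sketched above) and only then force the upgrade. Assuming the upgrade is driven through the Konvoy CLI, this would look roughly like the following; the exact subcommand may vary between Konvoy versions:

    konvoy up --force-upgrade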