Problem
During installation of Kommander, the HelmRelease listing shows failed installations related to kube-prometheus-stack:
NAMESPACE   NAME                        READY   STATUS                                                                         AGE
kommander   centralized-grafana         False   Helm install failed: failed pre-install: timed out waiting for the condition   34m
kommander   karma                       False   dependency 'kommander/karma-traefik' is not ready                              34m
kommander   karma-traefik               False   dependency 'kommander/kube-prometheus-stack' is not ready                      34m
kommander   kube-prometheus-stack       False   Helm install failed: failed pre-install: timed out waiting for the condition   34m
kommander   prometheus-adapter          False   dependency 'kommander/kube-prometheus-stack' is not ready                      34m
kommander   prometheus-thanos-traefik   False   dependency 'kommander/kube-prometheus-stack' is not ready                      34m
At the same time, the pods of the following jobs are failing:
crd-upgrades-part1-4wtwr   0/1   CrashLoopBackOff   5   3m31s
crd-upgrades-part2-n4fsl   0/1   CrashLoopBackOff   5   3m30s
The logs of the crd-upgrades jobs show the cause of the error:
The CustomResourceDefinition "alertmanagers.monitoring.coreos.com" is invalid: spec.preserveUnknownFields: Invalid value: true: must be false in order to use defaults in the schema
This is a known issue, discussed in the following prometheus-operator GitHub issue:
https://github.com/prometheus-operator/prometheus-operator/issues/4206
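The diagnosis above can be sketched as two small shell filters. This is an illustrative sketch, not part of the original procedure: `root_cause_releases` and `has_preserve_unknown_fields_error` are hypothetical helper names, the listing is assumed to have the column layout shown above (NAMESPACE, NAME, READY, STATUS, AGE), and the crd-upgrades pods are assumed to run in the kommander namespace.

```shell
# Hypothetical helper: print "namespace/name" for releases that failed on
# their own, skipping those that are only waiting on a dependency.
# Reads a HelmRelease listing (columns as shown above) on stdin.
root_cause_releases() {
  awk '$3 == "False" && $4 != "dependency" { print $1 "/" $2 }'
}

# Hypothetical helper: succeed if the job log on stdin contains the
# preserveUnknownFields CRD validation error quoted above.
has_preserve_unknown_fields_error() {
  grep -qF 'spec.preserveUnknownFields: Invalid value: true'
}

# Usage against a live cluster (namespace of the job pods is an assumption):
#   kubectl get hr -A | root_cause_releases
#   kubectl -n kommander logs crd-upgrades-part1-4wtwr | has_preserve_unknown_fields_error \
#     && echo "hit the known prometheus-operator CRD issue"
```

Filtering out the "dependency ... is not ready" rows leaves only the releases whose own install failed, which is where the job logs are worth reading.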
Solution
The issue was fixed in prometheus-operator v0.52.0, but Kommander 2.1.1 ships with v0.50.0.
As a workaround, upgrade the prometheus-operator image used by kube-prometheus-stack to the fixed version, v0.52.0.
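To check whether an operator image tag predates the fix, the version comparison can be sketched as below. `is_older_than_fix` is a hypothetical helper introduced for illustration; it assumes tags of the form vMAJOR.MINOR.PATCH and compares only against v0.52.0, the fixed version named above.

```shell
# Hypothetical helper: return 0 (true) when the given prometheus-operator
# tag is older than v0.52.0. Assumes a vMAJOR.MINOR.PATCH tag.
is_older_than_fix() {
  # Strip the leading "v", then compare each component numerically
  # against 0.52.0. awk's exit status becomes the function's status.
  printf '%s\n' "${1#v}" | awk -F. '{
    if ($1 != 0)  exit !($1 < 0)
    if ($2 != 52) exit !($2 < 52)
    exit !($3 < 0)
  }'
}

# Usage:
#   is_older_than_fix v0.50.0 && echo "image predates the fix"
```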
1. First, identify the ConfigMap used by kube-prometheus-stack:
kubectl get appdeployment kube-prometheus-stack -n kommander -oyaml
The ConfigMap name is the value listed under configOverrides.
2. Edit the ConfigMap:
kubectl edit cm <name-of-kube-prometheus-stack-overrides> -n kommander
Under the values.yaml section, add the block below. If a prometheusOperator section already exists, add only the image values to it.
apiVersion: v1
data:
  values.yaml: |
    prometheusOperator:
      image:
        tag: v0.52.0
      prometheusConfigReloaderImage:
        tag: v0.52.0
3. Delete the existing CRD and kube-prometheus-stack helmrelease.
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl -n kommander delete hr kube-prometheus-stack
This causes the new image, at the fixed version, to be pulled.
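Step 1 can be scripted rather than read by eye. The following is a minimal sketch, not part of the original procedure: `config_overrides_name` is a hypothetical helper that extracts the name under configOverrides from the AppDeployment YAML, assuming the field appears as a `configOverrides:` key followed by a `name:` line, as step 1 describes.

```shell
# Hypothetical helper: print the ConfigMap name listed under configOverrides.
# Reads AppDeployment YAML on stdin; assumes "name:" follows "configOverrides:".
config_overrides_name() {
  awk '
    /^ *configOverrides:/ { found = 1; next }  # entered the configOverrides block
    found && /^ *name:/   { print $2; exit }   # first name: after it is the ConfigMap
  '
}

# Usage against a live cluster (commands taken from the steps above):
#   cm=$(kubectl get appdeployment kube-prometheus-stack -n kommander -oyaml | config_overrides_name)
#   kubectl edit cm "$cm" -n kommander
```

The helper takes the first `name:` line after `configOverrides:`, so it is only as reliable as that layout assumption; checking the printed name against the YAML before editing is advisable.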
A permanent fix has been raised with engineering through internal ticket COPS-7163.