Customer Advisory
Advisory ID: | D2IQ-2020-0010 |
---|---|
Severity: | Medium |
Synopsis: | During a Konvoy 1.5.2 to 1.6.0 upgrade, cert-manager and dependent addons fail to upgrade when the Kommander addon is disabled. |
Affected Products & Versions: | Konvoy 1.6.0 |
Issue date: | 12-18-2020 |
Updated on: | 12-18-2020 |
Problem Description
During a Konvoy 1.5.2 to 1.6.0 upgrade, cert-manager and dependent addons fail to upgrade when the Kommander addon is disabled.
Context & Symptoms
A bug exists in the cert-manager upgrade process that causes upgrade failure when the Kommander addon is disabled in the cluster.yaml. This is due to a patch job that requires a Kommander resource to be present.
To identify the issue, check the kubeaddons controller logs for messages similar to the following:

    $ kubectl logs -l control-plane=kubeaddons-controller-manager -n kubeaddons --tail -1
    2020-12-01T22:39:09.631Z ERROR kubeaddons-controller.Addon.Helm3.install failed helm install {"ClusterAddon": "cert-manager", "namespace": "cert-manager", "generation": 2, "chart": "cert-manager-setup", "addon": "cert-manager", "version": "0.2.6", "valuesCRC32": "a95e27b5", "error": "failed post-install: job failed: BackoffLimitExceeded"}
    2020-12-01T22:52:58.160Z ERROR kubeaddons-controller.Addon.Helm3.Upgrade upgrade failed {"Addon": "dex", "namespace": "kubeaddons", "generation": 5, "chart": "dex", "addon": "dex", "version": "2.9.0", "valuesCRC32": "310bc9e2", "error": "current release manifest contains removed kubernetes api(s) for this kubernetes version and it is therefore unable to build the kubernetes objects for performing the diff. error from kubernetes: [unable to recognize \"\": no matches for kind \"Certificate\" in version \"certmanager.k8s.io/v1alpha1\", unable to recognize \"\": no matches for kind \"Issuer\" in version \"certmanager.k8s.io/v1alpha1\"]", "errorVerbose": "[unable to recognize \"\": no matches for kind \"Certificate\" in version \"certmanager.k8s.io/v1alpha1\", unable to recognize \"\": no matches for kind \"Issuer\" in version \"certmanager.k8s.io/v1alpha1\"]\ncurrent release manifest contains removed kubernetes api(s) for this kubernetes version and it is therefore unable to build the kubernetes objects for performing the diff. error from kubernetes\nhelm.sh/helm/v3/pkg/action.

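To narrow a long controller log down to the two error signatures quoted in this advisory, the output can be piped through a filter. The following is a minimal sketch, not an official tool: the function name `filter_cert_manager_errors` is hypothetical, and the patterns come only from the sample messages here, so other failure messages may not match.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that match the two error
# signatures quoted in this advisory. Reads log lines on stdin.
filter_cert_manager_errors() {
    grep -E 'BackoffLimitExceeded|certmanager\.k8s\.io/v1alpha1'
}

# Usage (requires kubectl access to the affected cluster):
#   kubectl logs -l control-plane=kubeaddons-controller-manager \
#     -n kubeaddons --tail=-1 | filter_cert_manager_errors
```

If the filter prints nothing, the cluster may be failing for a different reason than the one described here.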
Any Konvoy 1.6.0 installation that has Kommander disabled and does not use kubernetes-base-addons version stable-1.18-3.0.1 is affected when upgrading from Konvoy 1.5.x. Kommander is considered disabled if the Kommander addon is either not present in the cluster.yaml or set to enabled: false.
Workaround/Solution
Before attempting the upgrade, set the configVersion for kubernetes-base-addons to stable-1.18-3.0.1:

    addons:
      - configRepository: https://github.com/mesosphere/kubernetes-base-addons
        configVersion: stable-1.18-3.0.1

Keep all other versions as provided by the 1.6.0 release binary, then proceed with the upgrade as directed.
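Before starting the upgrade, it may help to confirm that the cluster.yaml really pins the expected version. The following is a rough sketch under stated assumptions: `has_fixed_base_addons` is a hypothetical helper name, and it assumes `configVersion` follows `configRepository` within the same addons entry. It is a plain text match, not a YAML parser, so unusual formatting (for example a trailing comment) will cause a false negative.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if the file pins
# kubernetes-base-addons to configVersion stable-1.18-3.0.1.
# $1: path to cluster.yaml
has_fixed_base_addons() {
    awk '
        # Remember that we just saw the kubernetes-base-addons repository.
        /configRepository:[ \t]*https:\/\/github\.com\/mesosphere\/kubernetes-base-addons[ \t]*$/ {
            in_repo = 1
            next
        }
        # The next configVersion line belongs to that repository entry.
        in_repo && /configVersion:/ {
            if ($NF == "stable-1.18-3.0.1") found = 1
            in_repo = 0
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Usage:
#   has_fixed_base_addons cluster.yaml && echo "configVersion is pinned"
```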
If an upgrade was already attempted, set the kubernetes-base-addons configVersion as described above, then run `konvoy deploy addons`.
If running Konvoy air-gapped, you can manually execute the following job once the upgrade is blocked and failing:

    apiVersion: batch/v1
    kind: Job
    metadata:
      namespace: cert-manager
      name: cert-manager-upgrade-restore
    spec:
      template:
        spec:
          serviceAccountName: cert-manager-upgrade
          containers:
            - name: cert-manager-upgrade
              image: mesosphere/kubeaddons-addon-initializer:v0.4.2
              command: ["cert-manager-upgrade", "restore"]
              env:
                - name: "CERT_MANAGER_NAMESPACE"
                  value: cert-manager
          restartPolicy: Never
      ttlSecondsAfterFinished: 900
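
One possible way to run the job is sketched below. It assumes the manifest above has been saved to a local file and that `kubectl` is configured for the affected cluster. The function name `run_restore_job`, the file name in the usage comment, and the 300-second timeout are illustrative choices, not values from the advisory.

```shell
#!/bin/sh
# Hypothetical helper: create the restore Job from a saved manifest and
# block until Kubernetes reports the Job as complete (or the wait times out).
# $1: path to the saved Job manifest
run_restore_job() {
    kubectl apply -f "$1" &&
    kubectl wait --for=condition=complete --timeout=300s \
        -n cert-manager job/cert-manager-upgrade-restore
}

# Usage:
#   run_restore_job cert-manager-upgrade-restore.yaml
```

Because the manifest sets ttlSecondsAfterFinished to 900, the finished Job may be cleaned up automatically about 15 minutes after it completes, so its status and pod logs are best inspected within that window.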