A customer running Kafka in DKP recently could not create topics because the TLS certificate used by the Kafka Operator had expired.
The TLS certificate is initially generated with a validity period of 365 days. If the cluster isn't upgraded and the Helm release isn't reconciled within that period, the certificate expires and this problem appears.
Both the CA and TLS certificates are securely stored within the "kafka-operator-serving-cert" secret (the namespace depends on the workspace). The CA certificate is also specified within a ValidatingWebhookConfiguration under the name "kafka-operator-validating-webhook".
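Because the same CA appears in both the secret and the webhook configuration, one quick consistency check is to compare certificate fingerprints. The following is a minimal sketch, not a definitive procedure. It assumes that NAMESPACE is set to the workspace namespace and that the CA is in the first webhook entry of the configuration. The function names are illustrative and are not part of DKP or the Kafka Operator.

```shell
# Print the SHA-256 fingerprint of the first PEM certificate read from stdin.
cert_fingerprint() {
  openssl x509 -noout -fingerprint -sha256
}

# Fingerprint of the CA stored in the secret.
# NAMESPACE is a placeholder that must be set by the reader.
secret_ca_fingerprint() {
  kubectl -n "$NAMESPACE" get secret kafka-operator-serving-cert \
    -o jsonpath='{.data.ca\.crt}' | base64 -d | cert_fingerprint
}

# Fingerprint of the CA referenced by the webhook configuration.
# Assumption: the relevant CA bundle is on the first webhook entry.
webhook_ca_fingerprint() {
  kubectl get validatingwebhookconfiguration kafka-operator-validating-webhook \
    -o jsonpath='{.webhooks[0].clientConfig.caBundle}' | base64 -d | cert_fingerprint
}
```

If the output of secret_ca_fingerprint and webhook_ca_fingerprint differs, the webhook is probably still trusting an older CA than the one in the secret.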
To fix the issue, trigger a renewal of the certificate by forcing a reconciliation of the "kafka-operator-1" Helm release. Suspend and then resume the release with the following commands:
kubectl -n <namespace> patch helmrelease kafka-operator-1 --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n <namespace> patch helmrelease kafka-operator-1 --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
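After resuming the release, it can help to wait for it to settle before checking the certificates. The sketch below assumes the HelmRelease exposes a Ready condition, as Flux HelmRelease resources normally do. The function name and the default timeout are illustrative choices, not values taken from DKP.

```shell
# Wait until the HelmRelease reports the Ready condition, then print its status.
# Arguments: namespace, release name (default: kafka-operator-1), timeout (default: 5m).
wait_for_helmrelease() {
  ns="$1"
  release="${2:-kafka-operator-1}"
  timeout="${3:-5m}"
  kubectl -n "$ns" wait --for=condition=Ready "helmrelease/$release" --timeout="$timeout" &&
    kubectl -n "$ns" get helmrelease "$release"
}
```

For example, wait_for_helmrelease "$NAMESPACE" blocks for up to five minutes. The wait may return immediately if the release was already marked Ready before the suspend, so treat it as a convenience and still inspect the certificates themselves.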
Once the reconciliation completes, verify that the certificates were renewed by checking the Validity section in the output of the following commands:
For the CA certificate:
kubectl -n <namespace> get secret kafka-operator-serving-cert -ojsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -in /dev/stdin -noout -text
For the TLS certificate:
kubectl -n <namespace> get secret kafka-operator-serving-cert -ojsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -in /dev/stdin -noout -text
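Instead of reading the full text output, the expiry can also be checked by script. The following is a minimal sketch built on the -checkend option of openssl x509. The function name and the threshold values are assumptions made for illustration.

```shell
# Succeeds (exit 0) if the PEM certificate on stdin expires within the given
# number of days; fails (exit 1) if it remains valid for longer than that.
# Note: input that cannot be parsed is also reported as "expiring".
cert_expires_within_days() {
  days="$1"
  # -checkend exits 0 when the certificate will NOT expire within the window,
  # so the result is inverted here.
  ! openssl x509 -noout -checkend "$((days * 86400))" >/dev/null
}
```

The decoded certificate from either command above can be piped into it, for example by replacing the trailing openssl command with cert_expires_within_days 30 && echo "renew soon". A freshly renewed certificate should not trigger the message.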
Ensuring the validity of these certificates will help maintain the secure operation of the Kafka Operator within the DKP environment.