When a cluster is configured with a custom-domain and a self-signed certificate, users sometimes cannot access the DKP UI (/dkp/kommander/dashboard) or cannot log in with users managed by an identity provider (LDAP, GitHub, GitLab, etc.) because the certificates have expired or are incorrectly configured. This article provides guidance on troubleshooting these issues, along with some examples of how they present.
When troubleshooting these kinds of issues, the traefik-forward-auth and dex-k8s-authenticator logs are the first place to look for evidence of the root cause.
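As a rough triage aid, the x509 failures in these logs can be bucketed by their message text. The following is a minimal sketch and not part of DKP: the function names classify_x509 and summarise_auth_logs and the labels they print are hypothetical, and the patterns assume the standard Go x509 error strings.

```shell
# Hypothetical helper: read log lines on stdin and print one label per
# recognised x509 failure. Lines without a known x509 error are ignored.
classify_x509() {
  while IFS= read -r line; do
    case "$line" in
      *"certificate signed by unknown authority"*)
        echo "unknown-authority" ;;
      *"x509: certificate is valid for "*)
        echo "hostname-mismatch" ;;
      *"certificate has expired or is not yet valid"*)
        echo "expired-or-not-yet-valid" ;;
    esac
  done
}

# Hypothetical wrapper: summarise both logs (requires cluster access).
summarise_auth_logs() {
  for selector in app.kubernetes.io/name=traefik-forward-auth app=dex-k8s-authenticator; do
    echo "== $selector"
    kubectl -n kommander logs -l="$selector" | classify_x509 | sort | uniq -c
  done
}
```

Running summarise_auth_logs against the affected cluster prints a count per failure type for each component, which helps decide which of the cases in this article applies.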
For example, when a CA certificate is missing or incorrectly defined, the traefik-forward-auth and dex-k8s-authenticator logs include entries stating that the configured certificate is signed by an unknown authority. Examples of these log entries are shown below:
kubectl -n kommander logs -l=app.kubernetes.io/name=traefik-forward-auth
time="2023-06-08T21:24:15Z" level=fatal msg="failed to get provider configuration for https://kommander240.ddns.net/dex: Get \"https://kommander240.ddns.net/dex/.well-known/openid-configuration\": x509: certificate signed by unknown authority (hint: make sure https://kommander240.ddns.net/dex is accessible from the cluster)"
kubectl -n kommander logs -l=app=dex-k8s-authenticator
WARNING: ca-certificates.crt does not contain exactly one certificate or CRL: skipping
WARNING: ca-cert-ca.pem does not contain exactly one certificate or CRL: skipping
2023/06/08 21:25:36 Using config file: /app/configuration/config.yaml
2023/06/08 21:25:36 useClusterHostname: true
2023/06/08 21:25:36 Creating new provider https://kommander240.ddns.net/dex
2023/06/08 21:25:36 Failed to query provider "https://kommander240.ddns.net/dex": Get https://kommander240.ddns.net/dex/.well-known/openid-configuration: x509: certificate signed by unknown authority
This usually means that access to the DKP UI (/dkp/kommander/dashboard) is impaired.
A specific example of this was reported by a customer who used Let's Encrypt to provide the certificate for the domain. For a reason not yet identified, cert-manager had updated tls.crt in the kommander-traefik-tls secret with the full certificate chain from Let's Encrypt, but had not removed ca.crt, which still held the previous root CA (kommander-ca), causing a conflict. The fix in this instance was to delete ca.crt from the secret.
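A quick way to spot this situation is to check how many certificates tls.crt holds and who issued them, then compare that with ca.crt. The sketch below assumes both fields have already been decoded from the secret into local files; the helper names count_certs and print_chain are hypothetical. The commented kubectl patch line is one possible way to drop the stale key, shown as an assumption rather than the documented procedure, so back up the secret first.

```shell
# Hypothetical helper: count the certificates in a PEM bundle.
count_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Hypothetical helper: print subject and issuer of every certificate in a
# PEM bundle (openssl x509 alone only reads the first certificate).
print_chain() {
  openssl crl2pkcs7 -nocrl -certfile "$1" | openssl pkcs7 -print_certs -noout
}

# Example session (requires cluster access, so left commented out):
#   kubectl -n kommander get secret kommander-traefik-tls \
#     -ojsonpath='{.data.tls\.crt}' | base64 -d > tls.crt
#   kubectl -n kommander get secret kommander-traefik-tls \
#     -ojsonpath='{.data.ca\.crt}' | base64 -d > ca.crt
#   count_certs tls.crt      # more than 1 means a chain is stored
#   print_chain tls.crt      # issuers in the chain
#   print_chain ca.crt       # subject of the configured CA
#
# Assumption: removing the stale key with a JSON patch.
#   kubectl -n kommander get secret kommander-traefik-tls -oyaml > backup.yaml
#   kubectl -n kommander patch secret kommander-traefik-tls --type=json \
#     -p='[{"op":"remove","path":"/data/ca.crt"}]'
```

If the subject printed for ca.crt does not appear as an issuer anywhere in the tls.crt chain, the two fields are out of step, which matches the conflict described above.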
Another example is when the custom-domain and the common name (CN) in the TLS certificate do not match. Both dex-k8s-authenticator and traefik-forward-auth will report the mismatch. In the example below, the configured custom-domain is kommander240.ddns.net and the certificate CN is kommander.sadielo.network:
kubectl -n kommander logs -l=app=dex-k8s-authenticator
WARNING: ca-certificates.crt does not contain exactly one certificate or CRL: skipping
WARNING: ca-cert-ca.pem does not contain exactly one certificate or CRL: skipping
2023/06/09 19:51:17 Using config file: /app/configuration/config.yaml
2023/06/09 19:51:17 useClusterHostname: true
2023/06/09 19:51:17 Creating new provider https://kommander240.ddns.net/dex
2023/06/09 19:51:17 Failed to query provider "https://kommander240.ddns.net/dex": Get https://kommander240.ddns.net/dex/.well-known/openid-configuration: x509: certificate is valid for kommander.sadielo.network, not kommander240.ddns.net
WARNING: ca-certificates.crt does not contain exactly one certificate or CRL: skipping
WARNING: ca-cert-ca.pem does not contain exactly one certificate or CRL: skipping
2023/06/09 19:51:05 Using config file: /app/configuration/config.yaml
2023/06/09 19:51:05 useClusterHostname: true
2023/06/09 19:51:05 Creating new provider https://kommander240.ddns.net/dex
2023/06/09 19:51:05 Failed to query provider "https://kommander240.ddns.net/dex": Get https://kommander240.ddns.net/dex/.well-known/openid-configuration: x509: certificate is valid for kommander.sadielo.network, not kommander240.ddns.net
kubectl --kubeconfig sortega-aws-dkp220-workload.conf -n kommander logs -l=app.kubernetes.io/name=traefik-forward-auth -f
time="2023-06-09T19:58:30Z" level=fatal msg="failed to get provider configuration for https://kommander240.ddns.net/dex: Get \"https://kommander240.ddns.net/dex/.well-known/openid-configuration\": x509: certificate is valid for kommander.sadielo.network, not kommander240.ddns.net (hint: make sure https://kommander240.ddns.net/dex is accessible from the cluster)"
time="2023-06-09T19:58:51Z" level=fatal msg="failed to get provider configuration for https://kommander240.ddns.net/dex: Get \"https://kommander240.ddns.net/dex/.well-known/openid-configuration\": x509: certificate is valid for kommander.sadielo.network, not kommander240.ddns.net (hint: make sure https://kommander240.ddns.net/dex is accessible from the cluster)"
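A mismatch like this can also be confirmed directly against the certificate, without waiting for the pods to log it. The following is a minimal sketch that assumes openssl is available locally and that the certificate has been saved to a PEM file; the function name cert_covers_host is hypothetical, and it relies on the -checkhost option of openssl x509.

```shell
# Hypothetical helper: succeed only if the certificate in PEM file $1
# is valid for hostname $2 (checked against SANs, falling back to the CN).
cert_covers_host() {
  openssl x509 -in "$1" -noout -checkhost "$2" \
    | grep -q 'does match certificate'
}

# Example (requires cluster access, so left commented out):
#   kubectl -n kommander get secret kommander-traefik-tls \
#     -ojsonpath='{.data.tls\.crt}' | base64 -d > tls.crt
#   cert_covers_host tls.crt kommander240.ddns.net || echo "hostname not covered"
```

For a certificate issued only for kommander.sadielo.network, the check against kommander240.ddns.net would fail, which is the same condition the log entries above describe.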
As described in the DKP documentation, a custom-domain and certificate are defined in the Kommander configuration file as follows:
clusterHostname: <mycluster.example.com>
ingressCertificate:
  certificate: <certs/cert.pem>
  private_key: <certs/key.pem>
  ca: <certs/ca.pem>
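Before applying a configuration like this, it can be worth checking that the private key actually belongs to the certificate, since a mismatched pair is an easy mistake to make when files are renamed or rotated. The sketch below is one way to do that with openssl; the function name key_matches_cert is hypothetical, and certs/cert.pem and certs/key.pem stand in for whatever paths the configuration file references.

```shell
# Hypothetical helper: succeed only if the private key in $2 corresponds
# to the public key embedded in the certificate in $1.
# Returns 2 if either file cannot be parsed.
key_matches_cert() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 2
  key_pub=$(openssl pkey -in "$2" -pubout 2>/dev/null) || return 2
  [ "$cert_pub" = "$key_pub" ]
}

# Example, using the placeholder paths from the configuration above:
#   key_matches_cert certs/cert.pem certs/key.pem || echo "key does not match certificate"
```

Comparing the public keys rather than file names or modulus hashes keeps the check independent of the key algorithm.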
Where are the certificates and keys stored in DKP?
Knowing which Kubernetes objects are used to store the certificate and custom-domain is useful when troubleshooting either authentication issues or DKP UI access.
The KommanderCluster host-cluster object and the kommander-traefik-certificate (or kommander-traefik-tls) secret in the kommander namespace are the objects to inspect when troubleshooting issues with custom-domains and TLS certificates in Kommander.
KommanderCluster
kubectl -n kommander get KommanderCluster host-cluster -oyaml
Certificate Authority
kubectl -n kommander get secret kommander-traefik-tls \
-ojsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -in /dev/stdin -noout -text
Certificate
kubectl -n kommander get secret kommander-traefik-tls \
-ojsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -in /dev/stdin -noout -text
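Since expired certificates are one of the causes mentioned at the start of this article, the validity window is worth checking explicitly rather than reading it out of the full text dump. The following is a minimal sketch using the -checkend option of openssl x509; the function name cert_valid_for is hypothetical, and it assumes the certificate has been saved to a local PEM file.

```shell
# Hypothetical helper: succeed only if the certificate in PEM file $1
# will still be valid $2 seconds from now (0 means "valid right now").
cert_valid_for() {
  openssl x509 -in "$1" -noout -checkend "$2" >/dev/null
}

# Example (requires cluster access, so left commented out):
#   kubectl -n kommander get secret kommander-traefik-tls \
#     -ojsonpath='{.data.tls\.crt}' | base64 -d > tls.crt
#   cert_valid_for tls.crt 0       || echo "certificate has expired"
#   cert_valid_for tls.crt 2592000 || echo "certificate expires within 30 days"
```

The same check can be run against ca.crt, because an expired CA produces verification failures even when the leaf certificate is still valid.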
The openssl tool can also confirm whether a given Certificate Authority (CA) was used to sign a certificate, as follows:
openssl verify -CAfile ca.crt kommander240.ddns.net.crt
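When tls.crt holds a full chain (the leaf followed by intermediates), as in the Let's Encrypt case described earlier, openssl verify only examines the first certificate in the file and needs the intermediates supplied separately. The sketch below passes the same bundle as the untrusted intermediate pool; the function name chain_verifies is hypothetical, and it assumes ca.crt and tls.crt have been decoded from the secret into local files.

```shell
# Hypothetical helper: succeed only if the first certificate in bundle $2
# chains up to the CA in $1, using any further certificates in $2 as
# untrusted intermediates.
chain_verifies() {
  openssl verify -CAfile "$1" -untrusted "$2" "$2" 2>/dev/null \
    | grep -q ': OK$'
}

# Example:
#   chain_verifies ca.crt tls.crt || echo "tls.crt does not chain to ca.crt"
```

A failure here while the certificate itself is valid and covers the right hostname suggests that ca.crt is not the authority that issued the chain in tls.crt.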