Problem
You receive the following Prometheus alert:
Prometheus is missing rule evaluations due to slow rule group evaluation.
You observe the following error messages in the prometheus-kube-prometheus-stack-prometheus-0-prometheus pod in the kommander namespace:
level=warn ts=2022-03-09T11:06:33.439Z caller=manager.go:651 component="rule manager" group=kube-apiserver.rules msg="Error on ingesting out-of-order result from rule evaluation" numDropped=266
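To gauge how many samples are being dropped, and by which rule group, the warnings can be totaled from the pod log. The following is a minimal sketch and not part of the original procedure: sum_dropped is a hypothetical helper name, and the pod name and container (-c prometheus) in the usage comment are assumptions derived from the StatefulSet name used in the Solution section.

```shell
# Hypothetical helper: reads Prometheus log lines on stdin and prints
# "<rule group> <total numDropped>" for every group that logged the
# out-of-order warning.
sum_dropped() {
  awk '
    /out-of-order result from rule evaluation/ {
      group = ""; dropped = 0
      for (i = 1; i <= NF; i++) {
        # "group=" is 6 characters, "numDropped=" is 11 characters
        if ($i ~ /^group=/)      { group = substr($i, 7) }
        if ($i ~ /^numDropped=/) { dropped = substr($i, 12) + 0 }
      }
      totals[group] += dropped
    }
    END { for (g in totals) printf "%s %d\n", g, totals[g] }
  '
}

# Usage (pod and container names are assumptions, adjust to your cluster):
#   kubectl -n kommander logs prometheus-kube-prometheus-stack-prometheus-0 \
#     -c prometheus | sum_dropped
```

If the total for kube-apiserver.rules keeps growing between runs, the duplicate rules are still being evaluated.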
Solution
According to this GitHub issue, there was a change to the Helm charts for Prometheus which moved some rules into their own separately grouped rule files, but left the original rules in place, thus creating duplicate rules.
To correct this, disable the kubeApiserver rules in the kube-prometheus-stack-overrides ConfigMap, according to the Kommander documentation:
1. Back up the current ConfigMap (if it exists):
kubectl -n kommander get configmap kube-prometheus-stack-overrides -o yaml > kube-prometheus-stack-overrides.orig.yaml
2. Create a new file called kube-prometheus-stack-overrides.yaml with the following content (merge it with the existing overrides if needed):
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-prometheus-stack-overrides
  namespace: <your-workspace-namespace>
data:
  values.yaml: |
    ---
    defaultRules:
      rules:
        kubeApiserver: false
3. Apply the new configuration:
kubectl -n kommander apply -f kube-prometheus-stack-overrides.yaml
After a few minutes, the configuration is reloaded. You can check this in the Prometheus UI: the number of cluster_quantile:apiserver_request_duration_seconds:histogram_quantile series should decrease from 10 to 5. If that doesn't happen, try restarting the StatefulSet:
kubectl -n kommander rollout restart statefulset prometheus-kube-prometheus-stack-prometheus
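The series count can also be checked from the command line through the Prometheus HTTP query API instead of the UI. This is a rough sketch under stated assumptions: extract_count is a hypothetical helper name, the pod name is assumed from the StatefulSet name above, Prometheus is assumed to listen on its default port 9090, and the sed expression only handles the single-value response that a count() query returns.

```shell
# Hypothetical helper: reads the JSON response of an instant query such as
# count(<metric>) on stdin and prints the numeric value.
# Prints nothing if the result is empty (no matching series).
extract_count() {
  sed -n 's/.*"value":\[[^,]*,"\([0-9][0-9]*\)"].*/\1/p'
}

# Usage (pod name and port are assumptions, adjust to your cluster):
#   kubectl -n kommander port-forward \
#     pod/prometheus-kube-prometheus-stack-prometheus-0 9090:9090 &
#   curl -sG 'http://localhost:9090/api/v1/query' \
#     --data-urlencode 'query=count(cluster_quantile:apiserver_request_duration_seconds:histogram_quantile)' \
#     | extract_count
```

Per the check described above, the printed value should drop from 10 to 5 once the duplicate rules are gone.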