Konvoy operators often find that gatekeeper kube-addon pods are OOM-killed because a container reaches its allocated memory limit. When this happens, the resources of the controllerManager and audit pods should be adjusted to match the actual workload in the cluster.
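To confirm that pod restarts were actually caused by OOM kills, the last termination state of the gatekeeper containers can be inspected. This is a sketch: the namespace assumes the default Konvoy kubeaddons install, and the deployment names match the verification commands later in this article.

```shell
# Show the last termination reason for each gatekeeper pod's containers;
# "OOMKilled" confirms the memory limit was hit.
# (Pod name prefixes assume the default gatekeeper-controller-manager
# and gatekeeper-audit deployments in the kubeaddons namespace.)
kubectl get pods -n kubeaddons -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' | grep '^gatekeeper'
```

A non-zero restart count together with a last-state reason of `OOMKilled` indicates the container was killed for exceeding its memory limit rather than failing for another reason.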
By default, the resources allocated to the gatekeeper controller manager [1] and audit [2] containers are:
controllerManager:
  resources:
    limits:
      cpu: 1000m
      memory: 512Mi
    requests:
      cpu: 100m
      memory: 256Mi
audit:
  resources:
    limits:
      cpu: 1000m
      memory: 512Mi
    requests:
      cpu: 100m
      memory: 256Mi

The recommended way to change these values is to define the following stanza under the gatekeeper block in the cluster.yaml:
- name: gatekeeper
  enabled: true
  values: |
    config:
      controllerManager:
        resources:
          limits:
            cpu: 1000m
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 256Mi
      audit:
        resources:
          limits:
            cpu: 1000m
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 256Mi

To confirm the changes were correctly applied, the following kubectl commands can be executed:
kubectl get deploy gatekeeper-controller-manager -n kubeaddons -ojsonpath='{.spec.template.spec.containers[*].resources}'
{"limits":{"cpu":"1","memory":"512Mi"},"requests":{"cpu":"100m","memory":"256Mi"}}
kubectl get deploy gatekeeper-audit -n kubeaddons -ojsonpath='{.spec.template.spec.containers[*].resources}'
{"limits":{"cpu":"1","memory":"512Mi"},"requests":{"cpu":"100m","memory":"256Mi"}}

References:
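For a quick experiment before committing new values to cluster.yaml, the limits can also be raised directly on the deployment. Note that this is only a temporary sketch: Konvoy's addon reconciliation will revert manual edits, and the resource values below are illustrative, not a recommendation.

```shell
# Temporarily raise the controller manager's resources to test whether the
# OOM kills stop (reverted on the next addon reconcile; values are examples)
kubectl set resources deploy gatekeeper-controller-manager -n kubeaddons \
  --limits=cpu=1000m,memory=1Gi --requests=cpu=100m,memory=512Mi
```

If the higher limit stops the OOM kills, carry the same values into the gatekeeper stanza in cluster.yaml so they persist across reconciles.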
[1] https://github.com/mesosphere/charts/blob/master/staging/gatekeeper/values.yaml#L56
[2] https://github.com/mesosphere/charts/blob/master/staging/gatekeeper/values.yaml#L65