When using extended resources such as GPUs, the command kubectl describe resourcequota -A | grep nvidia can show 0 GPU usage even though multiple active tasks are using the resource. A side effect is that those tasks can end up consuming more GPUs than the quota was supposed to allow.
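The quota definition that produced the 0 reading is not shown in the original report, but based on the documentation quoted below, a plausible cause is a hard entry that uses the bare extended-resource name instead of the requests. prefix. A hypothetical example of such a definition (the name and namespace are placeholders):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: nvidia-quota
  namespace: user1
spec:
  hard:
    memory: 20Gi
    nvidia.com/gpu: "3"    # bare extended-resource name: presumably not tracked, so reported usage stays at 0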
The Kubernetes resource quota documentation [1] explains this clearly: "As overcommit is not allowed for extended resources, it makes no sense to specify both requests and limits for the same extended resource in a quota. So for extended resources, only quota items with prefix requests. is allowed for now."
It also gives a GPU example: "Take the GPU resource as an example, if the resource name is nvidia.com/gpu, and you want to limit the total number of GPUs requested in a namespace to 4, you can define a quota as follows: requests.nvidia.com/gpu: 4"
A workaround is to define the quota using the requests. prefix, as in the example below:
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nvidia-quota
  namespace: user1
spec:
  hard:
    memory: 20Gi
    requests.nvidia.com/gpu: "3"
EOF
With the requests. prefix in place, GPU usage is tracked and the quota is reflected correctly, as can be seen when watching the quota:
$ k get resourcequotas -w
NAME           AGE   REQUEST                                                    LIMIT
nvidia-quota   93m   memory: 2702Mi/20Gi, requests.nvidia.com/gpu: 1/3
nvidia-quota   93m   memory: 7833252352/20Gi, requests.nvidia.com/gpu: 2/3
nvidia-quota   93m   memory: 12833252352/20Gi, requests.nvidia.com/gpu: 3/3
nvidia-quota   94m   memory: 7833252352/20Gi, requests.nvidia.com/gpu: 2/3
nvidia-quota   94m   memory: 2702Mi/20Gi, requests.nvidia.com/gpu: 1/3
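The diagnostic command from the beginning of this note should likewise report non-zero GPU usage once pods are running against the corrected quota (the exact figures will vary with what is running in the namespace):

$ kubectl describe resourcequota nvidia-quota -n user1
$ kubectl describe resourcequota -A | grep nvidia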
Reference:
[1] https://kubernetes.io/docs/concepts/policy/resource-quotas/#resource-quota-for-extended-resources