Known Issue: Pod Networking Not Working On Preprovisioned RHEL 8.4 Nodes in AWS

Steven Sylvester

Updated January 31, 2024 02:26

Users who created clusters with the Preprovisioned provider in AWS on Red Hat Enterprise Linux (RHEL) 8.4 nodes have reported that pod networking does not work as expected.

If you are affected by this issue, you may notice symptoms like CoreDNS pods showing as unhealthy or crashing, and workload pods being killed because of failing readiness checks.

This is due to a NetworkManager bug in RHEL 8.4 where a conflicting entry in a route table hijacks traffic between the host and its containers. As a result, the kubelet cannot connect to a container's readiness endpoint, so the readiness check fails even when the container itself is healthy.
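
To see whether a node has extra routing tables of the kind described above, one option is to look at its policy routing rules. The following is a minimal sketch, not an official diagnostic: list_custom_tables is a hypothetical helper name, and it assumes the conflicting entry is reachable through a rule printed by `ip rule show`.

```shell
# list_custom_tables (hypothetical helper): read `ip rule show` output on stdin
# and print each routing table referenced by a rule, excluding the three
# built-in tables (local, main, default).
list_custom_tables() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "lookup" && $(i+1) !~ /^(local|main|default)$/)
                print $(i+1)
    }' | sort -u
}
```

On an affected node this could be run as `ip rule show | list_custom_tables`, and each table it prints can then be inspected with `ip route show table <name>`. Other software on the node may also add its own tables, so a non-empty result does not by itself confirm this issue.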

Until this bug is resolved upstream in RHEL 8, the workaround is to edit /etc/systemd/system/nm-cloud-setup.service.d/override.conf on each preprovisioned node, creating the file if it does not exist. The file should have the following contents:

[Service]
Environment=NM_CLOUD_SETUP_EC2=no
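
The file can also be written by a small script, which may be easier to repeat across many nodes. This is a sketch under stated assumptions: write_nm_override is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a scratch directory first.

```shell
# write_nm_override (hypothetical helper): create the drop-in directory if
# needed and write override.conf with the two lines given in this article.
write_nm_override() {
    # $1: drop-in directory; defaults to the path named in this article.
    dropin_dir="${1:-/etc/systemd/system/nm-cloud-setup.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # Overwrites any existing override.conf in that directory.
    printf '%s\n' '[Service]' 'Environment=NM_CLOUD_SETUP_EC2=no' \
        > "$dropin_dir/override.conf"
}
```

On a node, it would be run as root, for example `write_nm_override && systemctl daemon-reload`. The reload step is not stated in this article; it is included here because systemd generally reads new drop-in files only after a reload. Routes that were already installed may persist until the service next runs or the node is rebooted, which is also an assumption to verify on your own nodes.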