Services pending due to invalid metallb configuration
Overview/Background
In an on-premise Konvoy cluster, metallb provisions Service objects of type LoadBalancer. The addresses that will be allocated to these Service objects are specified in the addon's values.configInline.address-pools.addresses field in your cluster.yaml:
kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta1
spec:
  addons:
    addonsList:
    - name: metallb
      enabled: true
      values: |-
        configInline:
          address-pools:
          - name: default
            protocol: layer2
            addresses:
            - 10.0.50.25-10.0.50.50
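Each item in the addresses list must be either an IP range (for example, 10.0.50.25-10.0.50.50) or a CIDR block (for example, 10.0.50.0/24). A bare IP address without a prefix length, such as 10.5.124.101, is not valid; to allocate a single address, write it as a CIDR (10.5.124.101/32).
If an entry uses invalid syntax, metallb rejects the configuration, and Services of type LoadBalancer may be stuck in a pending state with no external IP assigned: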
$ kubectl get svc -A | grep 'TYPE\|LoadBalancer'
NAMESPACE      NAME                   TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                                                                                                                                      AGE
istio-system   istio-ingressgateway   LoadBalancer   10.0.50.123   <pending>     15020:30829/TCP,80:31380/TCP,443:31390/TCP,31400:31400/TCP,15029:32279/TCP,15030:31933/TCP,15031:31698/TCP,15032:31726/TCP,15443:30690/TCP   13d
kubeaddons     traefik-kubeaddons     LoadBalancer   10.0.50.12    <pending>     80:30657/TCP,443:31686/TCP,8080:31465/TCP                                                                                                    13d
velero         minio-lb               LoadBalancer   10.0.0.123    <pending>     9000:32606/TCP
Additionally, you will observe the following warnings in the metallb controller and/or speaker logs:
$ kubectl logs -n kubeaddons -l app=metallb,component=controller
W0102 12:07:02.324798       1 reflector.go:302] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: watch of *v1.ConfigMap ended with: too old resource version: 11283 (12569)
{"caller":"k8s.go:361","configmap":"kubeaddons/metallb-kubeaddons","error":"parsing address pool #1: invalid CIDR \"10.5.124.101\" in pool \"default\"","event":"configStale","msg":"config (re)load failed, config marked stale","ts":"2020-01-02T12:07:03.328566634Z"}
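The controller log mixes klog-style lines with JSON records, so the relevant error is easy to miss. As a rough sketch, the following Python helper picks out only the records whose event is "configStale" and returns their error text. The function name stale_config_errors is hypothetical, and the field names ("event", "error") are taken from the sample log line above; other metallb versions may log differently.

```python
import json

# The two log lines shown above, copied verbatim.
SAMPLE_LOG = [
    r'W0102 12:07:02.324798       1 reflector.go:302] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: watch of *v1.ConfigMap ended with: too old resource version: 11283 (12569)',
    r'{"caller":"k8s.go:361","configmap":"kubeaddons/metallb-kubeaddons","error":"parsing address pool #1: invalid CIDR \"10.5.124.101\" in pool \"default\"","event":"configStale","msg":"config (re)load failed, config marked stale","ts":"2020-01-02T12:07:03.328566634Z"}',
]


def stale_config_errors(log_lines):
    """Return the 'error' text of every JSON log line whose event is 'configStale'."""
    errors = []
    for line in log_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # not a JSON record, e.g. the reflector warning
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or malformed line; ignore it
        if record.get("event") == "configStale":
            errors.append(record.get("error", ""))
    return errors


if __name__ == "__main__":
    for error in stale_config_errors(SAMPLE_LOG):
        print(error)
```

To use it against a live cluster, the output of the kubectl logs command above could be saved to a file and its lines passed to stale_config_errors in place of SAMPLE_LOG.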
Solution
To resolve this issue, ensure that every entry under addresses in your cluster.yaml is a valid IP range or CIDR, then re-run `konvoy up` to update your cluster.
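Before re-running `konvoy up`, the address entries can be sanity-checked locally. The sketch below uses Python's standard ipaddress module to approximate the rule described above (an IP range or a CIDR, never a bare address). It is not metallb's own parser, and the function name is_valid_pool_entry is hypothetical, so treat a passing result as a hint rather than a guarantee.

```python
import ipaddress


def is_valid_pool_entry(entry: str) -> bool:
    """Return True if entry looks like 'START-END' or a CIDR such as '10.0.50.0/24'."""
    entry = entry.strip()
    try:
        if "-" in entry:
            start_text, end_text = entry.split("-", 1)
            start = ipaddress.ip_address(start_text.strip())
            end = ipaddress.ip_address(end_text.strip())
            # Both ends must be the same IP version and the range must not run backwards.
            return start.version == end.version and start <= end
        if "/" in entry:
            # strict=False also accepts a host address with a prefix, e.g. 10.0.50.25/24.
            ipaddress.ip_network(entry, strict=False)
            return True
    except ValueError:
        return False
    # A bare address with no prefix length is rejected.
    return False


if __name__ == "__main__":
    for candidate in ["10.0.50.25-10.0.50.50", "10.5.124.101", "10.5.124.101/32"]:
        status = "ok" if is_valid_pool_entry(candidate) else "INVALID"
        print(f"{candidate}: {status}")
```

The bare address 10.5.124.101 from the log above is reported as invalid, while the same address with a /32 prefix passes.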