
Add opt-in preflight admission readiness check before kubeadm upgrade apply #13139

@felipe88alves

Description

What would you like to be added

I'd like to add an opt-in preflight check that verifies the API server's admission pipeline is ready before kubeadm upgrade apply runs on the first control plane node.

We run kubespray across multiple production clusters and have been hitting intermittent [ERROR CreateJob] failures during upgrades. Our clusters use ValidatingAdmissionPolicy with a CRD-based paramKind, and after the API server restarts mid-upgrade, the VAP param controller's RESTMapper takes ~30 seconds to refresh its CRD discovery cache (kubernetes/kubernetes#124237). During that window, kubeadm upgrade apply's preflight health-check Job gets rejected by the stale admission controller.

Here's the relevant error from a real upgrade run (sanitized):

TASK [kubernetes/control-plane : Kubeadm | Upgrade first control plane node]
...
[ERROR CreateJob]: could not create Job "upgrade-health-check-xxxxx" in the namespace
"kube-system": jobs.batch "upgrade-health-check-xxxxx" is forbidden:
ValidatingAdmissionPolicy '<vap-policy>' denied request: failed to configure policy:
failed to find resource referenced by paramKind: '<crd-group/version, Kind=ParamCRD>'

The VAP param controller can't resolve the CRD paramKind reference because its RESTMapper cache is stale after the API server restart. The Job itself is fine; the admission pipeline is temporarily broken.

The existing check-api.yml only checks /healthz, which returns 200 while the admission pipeline is still broken. A potential fix would be to add an opt-in task between check-api.yml and kubeadm upgrade apply that dry-runs a batch/v1 Job (the same resource kind kubeadm creates for its preflight) through the server-side admission pipeline, retrying until admission controllers are ready. This would follow existing patterns like the configurable health-check approach from #12448/PR #12452.
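A rough sketch of what that opt-in task could look like, to make the proposal concrete. The task name, the gating variable (`upgrade_admission_readiness_check`), the probe Job name, and the retry/delay values below are all illustrative, not existing kubespray names; `{{ bin_dir }}` is assumed to resolve to the kubectl install path as elsewhere in kubespray:

```yaml
# Illustrative only: variable and task names are hypothetical, not kubespray's.
- name: Kubeadm | Wait for admission pipeline readiness (opt-in)
  command: >-
    {{ bin_dir }}/kubectl create job admission-readiness-probe
    --namespace=kube-system
    --image=registry.k8s.io/pause:3.9
    --dry-run=server
  register: admission_probe
  until: admission_probe.rc == 0
  retries: 12
  delay: 5
  changed_when: false
  when: upgrade_admission_readiness_check | default(false)
```

Because the Job is created with `--dry-run=server`, it passes through the full server-side admission chain (including VAP param resolution) but is never persisted, so the probe leaves nothing behind in the cluster.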

I have a working implementation on our fork and would be happy to submit a PR.

Why is this needed

I've been hitting this failure intermittently across our production clusters during Kubernetes upgrades with kubespray. Our clusters use ValidatingAdmissionPolicy with CRD-based paramKind references, and the upgrade fails on the first control plane node, where kubeadm upgrade apply is called.

The root cause is a timing issue in the upgrade flow:

  1. Tasks in kubeadm-setup.yml (audit policy, admission control configs, tracing) and kubeadm-fix-apiserver.yml (component kubeconfigs) notify the Restart apiserver / restart kubelet handlers
  2. The handlers fire and restart the API server
  3. The VAP param controller's RESTMapper cache is invalidated and takes ~30 seconds to refresh (kubernetes/kubernetes#124237 - still open, no fix merged; related: kubernetes/kubernetes#122658)
  4. check-api.yml passes because /healthz returns 200 while the admission pipeline is still stale
  5. kubeadm upgrade apply creates its preflight health-check Job, which gets rejected with HTTP 422
  6. Upgrade fails with [ERROR CreateJob]
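In shell terms, steps 4 and 5 mean the readiness signal has to be stronger than /healthz: a server-side dry-run of a batch/v1 Job exercises the real admission pipeline and fails exactly while the RESTMapper cache is stale. A minimal sketch of the retry probe (function name, retry counts, and the probe Job name are illustrative):

```shell
#!/usr/bin/env bash
# Illustrative helper; names and defaults are not from kubespray.
# Retries a probe command until it succeeds or retries are exhausted.
retry_admission_probe() {
  local retries="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= retries; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# --dry-run=server sends the Job through the full admission chain (including
# VAP param resolution) without persisting it, so this fails during the
# RESTMapper staleness window and succeeds once the cache refreshes.
# Guarded so the sketch is a no-op without kubectl and a reachable cluster.
if command -v kubectl >/dev/null 2>&1 && kubectl cluster-info >/dev/null 2>&1; then
  retry_admission_probe 12 5 \
    kubectl create job admission-readiness-probe \
      --namespace=kube-system \
      --image=registry.k8s.io/pause:3.9 \
      --dry-run=server
fi
```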

In this case, we traced the failure to the VAP param controller failing to resolve a custom CRD paramKind after the restart. API server audit logs showed the first 422 error ~30 seconds before kubeadm upgrade apply ran, consistent with the RESTMapper refresh window.

This is different from #12006/#12009, which fixed secondary control plane nodes by switching to kubeadm upgrade node (PR #12015); our issue is on the first control plane node, where kubeadm upgrade apply is the correct command. It's also different from #12448 (PR #12452), which made the /healthz retry count configurable. Here the /healthz check passes fine; it's the admission pipeline that's broken.

I think anyone running VAP with CRD paramKind references will eventually hit this during upgrades. The preflight check adds a direct probe of the admission pipeline to catch the staleness window. It should be deprecated once kubernetes/kubernetes#124237 is fixed in all supported Kubernetes versions.

Metadata


Labels

kind/feature: Categorizes issue or PR as related to a new feature.
