Skip to content

Name each serving-stack Helm release after its chart#314

Open
dennis-upbound wants to merge 1 commit into
modelplaneai:mainfrom
dennis-upbound:serving-stack-sa-name-length
Open

Name each serving-stack Helm release after its chart#314
dennis-upbound wants to merge 1 commit into
modelplaneai:mainfrom
dennis-upbound:serving-stack-sa-name-length

Conversation

@dennis-upbound

Copy link
Copy Markdown
Collaborator

Description of your changes

Fixes #215.

provider-helm names each Helm release after the composed resource, whose generated name is <inferencecluster>-<hash>. That name is long, unpredictable, and length-capped, and it flows into every chart's fullname and from there into the names of the resources the chart creates. A chart truncates its own fullname to 63 chars and then appends suffixes — the NVIDIA DRA driver's kubelet-plugin ServiceAccount is <fullname>-service-account-kubeletplugin — so the final name can exceed 63 even when the fullname did not. Kubernetes label values are capped at 63 chars, so any consumer that copies such a name into a label value rejects it. On Cilium clusters the per-pod CiliumIdentity keys on the ServiceAccount name, so the DRA driver's kubelet-plugin pod never gets a network sandbox, never schedules, and no GPU is ever bound.

This isn't specific to the DRA driver. Rendering every chart at a realistic release name shows the DRA driver overflows at essentially any InferenceCluster name, while node-feature-discovery and lws overflow once the name is a little longer, and the kube-prometheus-stack node-exporter subchart overflows the same way (it ignores the chart's fullnameOverride and uses the release name directly). The shared cause is the release name, so the fix belongs there rather than per chart.

_helm_release now sets crossplane.io/external-name to mp-<chart>, which provider-helm uses verbatim as the release name. Each release is named for its chart (mp-dra-driver-nvidia-gpu) instead of the InferenceCluster, so every derived name is short and independent of the InferenceCluster name: the budget below holds for any deployment, not only the ones tested. The mp- prefix reserves a release-name namespace for the releases this function owns; without it a release named cert-manager would adopt or collide with a cert-manager release a user already runs in that namespace, and mp-cert-manager cannot.

Limit Tightest case Length
Helm release name ≤ 53 mp-node-feature-discovery 25
ServiceAccount name ≤ 63 mp-dra-driver-nvidia-gpu-service-account-kubeletplugin 54

The DRA driver is the binding constraint at 54/63: it appends the longest suffix (-service-account-kubeletplugin, 30 chars) to a release name that already contains the chart name. Every other chart is smaller. The release name is stable across chart-version upgrades, so provider-helm upgrades in place rather than treating each version as a new release, and the composed resource keeps its generated, hashed metadata.name, so managed-resource uniqueness in the control plane is unchanged.

This does change release identity. On an existing install provider-helm installs fresh releases under the new mp-<chart> names and orphans the old <inferencecluster>-<hash> releases, so migrating a running stack means a teardown and reinstall. That's acceptable pre-1.0, and the Cilium clusters this fixes are non-functional today regardless.

I have:

  • Read and followed Modelplane's contribution process.
  • Run nix flake check (or ./nix.sh flake check) and made sure it passes.
  • Added or updated tests covering any composition function changes.
  • Signed off every commit with git commit -s.

provider-helm names each release after the composed resource, whose
generated name is <inferencecluster>-<hash>. That name flows into every
chart's fullname and from there into the chart's resource names. A chart
truncates its fullname to 63 chars and then appends suffixes - the NVIDIA
DRA driver's kubelet-plugin ServiceAccount is
<fullname>-service-account-kubeletplugin - so the final name can exceed 63
even when the fullname did not. Kubernetes label values are capped at 63
chars, so a consumer that copies the name into a label value rejects it: on
Cilium clusters the per-pod CiliumIdentity keys on the ServiceAccount name,
so the DRA driver's kubelet-plugin pod never gets a network sandbox and no
GPU is bound. node-feature-discovery, lws, and the kube-prometheus-stack
node-exporter subchart overflow the same way once the name is long enough.

_helm_release now sets crossplane.io/external-name to mp-<chart>, which
provider-helm uses verbatim as the release name. Naming each release for
its chart makes every derived name short and independent of the
InferenceCluster name; the longest is the DRA driver's kubelet-plugin
ServiceAccount at 54 of 63 chars. The mp- prefix reserves a release-name
namespace for releases this function owns, so an mp-cert-manager release
can't adopt or collide with a cert-manager release a user already runs in
that namespace. The name is stable across chart-version upgrades, so
provider-helm upgrades in place. The composed resource keeps its generated
metadata.name, so control-plane uniqueness is unchanged.

A new test runs the function with an oversized InferenceCluster name and
asserts every release is named mp-<chart>, so the release name - and every
name a chart derives from it - stays within the 63-char label limit no
matter how long the InferenceCluster name is.

This changes release identity. On an existing install provider-helm
installs fresh releases under the new names and orphans the old
<inferencecluster>-<hash> releases, so migrating a running stack needs a
teardown and reinstall.

Fixes modelplaneai#215.

Signed-off-by: Dennis Ramdass <dennis@upbound.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ServingStack NVIDIA DRA driver fails on Cilium clusters: service-account name exceeds 63-char CiliumIdentity label limit

1 participant