Skip to content

fix(sdk,test): cascade-delete ephemeral chain validators (Bugbot high-sev follow-up to #428)#429

Merged
bdchatham merged 2 commits into
mainfrom
fix/wsi-network-deletion-policy-cascade
Jun 23, 2026
Merged

fix(sdk,test): cascade-delete ephemeral chain validators (Bugbot high-sev follow-up to #428)#429
bdchatham merged 2 commits into
mainfrom
fix/wsi-network-deletion-policy-cascade

Conversation

@bdchatham

Copy link
Copy Markdown
Collaborator

Follow-up to #428 — Bugbot high-severity finding

Bugbot flagged on #428 (verified valid): the integration harness's teardown orphans validator SeiNodes.

Root cause (confirmed in-tree)

  • provision creates the genesis SeiNetwork; teardown deletes it + the RPC SeiNodes.
  • The CRD defaults DeletionPolicy=Retain (+kubebuilder:default=Retain, seinetwork_types.go:98), and under Retain the controller strips the validator children's ownerRef (removeOwnerRef, internal/controller/seinetwork/nodes.go:301) so they survive network deletion.
  • The validators are controller-created → they never carry sei.io/harness-run → neither t.Cleanup nor the label-GC sweep reaps them. Leaks 4 validators + PVCs every run in the shared nightly namespace.

Fix

  • Add DeletionPolicy to the SDK NetworkSpec (plain string + DeletionDelete/DeletionRetain constants — core stays stdlib-only; threaded into renderNetwork). Same additive one-way-door pattern as Labels (feat(test): WS-I sei integration suite — TestBenchmark provision spine [DRAFT] #428).
  • The harness sets DeletionDelete — an ephemeral chain cascade-deletes its validators (+ PVCs) on teardown. Correct for a throwaway test chain (distinct from production fleet nodes, which stay Retain).
  • Locked by a render_test assertion that DeletionPolicy threads through.

Verification

go build ./... clean · go test ./sdk/... pass (incl. new assertion) · go test -c -tags integration compiles · golangci-lint 0 issues.

🤖 Generated with Claude Code

…-sev)

Bugbot (PR #428) caught a real leak: provision creates the genesis
SeiNetwork, and teardown deletes that network — but the CRD defaults
DeletionPolicy=Retain, and the controller strips the validator children's
ownerRef under Retain (removeOwnerRef), so deleting the network ORPHANS the
controller-created validator SeiNodes. Those validators never carry
sei.io/harness-run (the harness doesn't create them), so neither t.Cleanup
nor the label-GC sweep reaps them — leaking 4 validators + PVCs per run in
the shared nightly namespace.

Add DeletionPolicy to the SDK NetworkSpec (string + DeletionDelete/Retain
constants, stdlib-only core; threaded into renderNetwork). The integration
harness sets DeletionDelete so an ephemeral chain cascade-deletes its
validators (+ PVCs) on teardown. Locked by a render_test assertion.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cursor

cursor Bot commented Jun 23, 2026

Copy link
Copy Markdown

PR Summary

Low Risk
Additive SDK field with harness-only default for test chains; production callers omit it and keep CRD Retain behavior.

Overview
Fixes validator SeiNode + PVC leaks when the integration harness tears down a SeiNetwork: the CRD default DeletionPolicy=Retain orphans controller-created validators that never get sei.io/harness-run, so normal cleanup and label-GC cannot reap them.

The SDK gains NetworkSpec.DeletionPolicy (DeletionDelete / DeletionRetain string constants) and renderNetwork maps it to spec.deletionPolicy (empty string keeps Retain). provision in the integration harness sets DeletionDelete for ephemeral nightly chains. A render_test assertion locks the mapping.

Integration package comments in benchmark_test.go and harness_test.go are trimmed; behavior is unchanged aside from the deletion policy.

Reviewed by Cursor Bugbot for commit d9d6b00. Bugbot is set up for automated code reviews on this repo. Configure here.

Two-lens comment-standards review (idiom D10 + prose dual-audience):
- strip meta/ID cruft from code comments (review-tool + design-step/decision
  IDs belong in the PR, not source)
- drop migration-history framing (present-state only)
- fix one drift: TestBenchmark doc claimed seiload drive/report the body skips
- trim the agent-verbose package doc + de-duplicate the DeletionPolicy
  rationale to a single canonical home

No behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bdchatham

Copy link
Copy Markdown
Collaborator Author

Comment-standards expert review applied (idiom D10 + prose register)

Per request, ran a two-lens review scoped to comment register across the WS-I harness + SDK surface. Both converged; all findings applied in d9d6b00:

  • stripped meta/ID cruft from code comments (Bugbot high-sev, WS-I step N, decision D*) — that context lives here in the PR, not in source
  • dropped migration-history framing (replaces…, used to pass…) → present-state only
  • fixed one drift (gating): TestBenchmark's doc claimed seiload drive/report the body t.Skips
  • trimmed agent-verbose package doc + de-duplicated the DeletionPolicy rationale to one canonical home

Vetted-clean and kept: the // -> spec.X field mappings, blank-import rationale, SIGTERM/why comments — the standard is why-not-what + crisp, not fewer comments.

Durable follow-up the idiom lens recommended: graduate this into /pr-quality as a grep gate (Bugbot|WS-I|xreview|decision D[0-9] in *.go comments outside TODO() so it's mechanically enforced. Will forward to Tide.

@bdchatham bdchatham merged commit 5db7aca into main Jun 23, 2026
5 checks passed
@bdchatham bdchatham deleted the fix/wsi-network-deletion-policy-cascade branch June 23, 2026 00:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant