feat(test): WS-I sei integration suite — TestBenchmark provision spine [DRAFT] #428
…n spine

Go-native nightly harness as plain `go test` targets (WS-I). Stage 1: the TestBenchmark load suite's provisioning spine: CreateNetwork (4 validators) + N standalone RPC SeiNodes via the sei SDK in-process, each waited to Running + caught-up + EVM-serving, torn down via t.Cleanup.

Conventions / isolation:
- test/integration/*_test.go, //go:build integration: Go's _test.go rule guarantees zero test code links into any production binary (controller / seid / seitask); the build tag hides it from default `go test ./...`.
- ships only via `go test -c -tags integration` as a standalone binary in its own image, run by one in-cluster CronJob per target (-test.run TestX).
- imports ONLY sdk/sei (+ k8s provider), no internal/seitask or internal/taskruntime, so the seitask runner deletes wholesale once the four targets (Benchmark/ChaosSuite/ChainUpgrade/Release) land.

seiload + chaos are decoupled units the suite will APPLY, not construct (seiload from its own manifest; chaos from platform-owned fault CRs). The seiload drive + S3 report is the next increment (marked TODO + t.Skip).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…model hardening

Resolves the 3-lens xreview findings on the integration-suite foundation:
- F1 (correctness, all 3 lenses): the sei.io/harness-run GC label was declared but never stamped (SDK specs had no Labels field; renderNetwork stamped none), so the sole abnormal-exit reaper selected nothing. Add Labels to NetworkSpec/NodeSpec, thread into renderNetwork (was unlabeled) + renderNode (caller labels merge UNDER the canonical role/seinetwork, which win on collision); provision stamps runLabelKey=runID on the network + every node. Locked by render_test label assertions.
- F2 (correctness): a -test.timeout breach panics and bypasses t.Cleanup; derive ctx via signal.NotifyContext(SIGTERM) so the activeDeadlineSeconds grace period triggers teardown before SIGKILL. (-test.timeout 0 CronJob requirement recorded in the LLD.)
- F5 (systems): per-gate t.Logf progress so a stall is localizable in real time (which node, which gate) instead of one terminal error.
- idiom advisories: env->envOr, for-range, Skipf naming what was provisioned, honest runLabelKey comment.

F3/F4 (-test.timeout default, -test.run no-match false-green) are recorded as CronJob/CI run-model requirements (LLD), enforced with the platform wiring. SDK Labels is a Brandon-approved one-way-door addition.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
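The F1 merge rule (caller labels go underneath the canonical labels, which win on collision) could look roughly like the sketch below. This is not the SDK's renderNode: `mergeLabels` and the `sei.io/role` / `sei.io/seinetwork` key strings are assumptions made for illustration; only `sei.io/harness-run` is named in the PR.

```go
package main

import "fmt"

const (
	runLabelKey     = "sei.io/harness-run" // named in the PR
	labelRole       = "sei.io/role"        // assumed key, illustration only
	labelSeiNetwork = "sei.io/seinetwork"  // assumed key, illustration only
)

// mergeLabels (hypothetical helper) copies caller labels first and then
// writes the canonical labels on top, so a canonical key always wins when
// both maps define it.
func mergeLabels(caller, canonical map[string]string) map[string]string {
	out := make(map[string]string, len(caller)+len(canonical))
	for k, v := range caller {
		out[k] = v
	}
	for k, v := range canonical {
		out[k] = v
	}
	return out
}

func main() {
	caller := map[string]string{runLabelKey: "run-xyz", labelRole: "spoofed"}
	canonical := map[string]string{labelRole: "rpc", labelSeiNetwork: "bench"}

	merged := mergeLabels(caller, canonical)
	// The run label survives; the caller's attempt to override the role does not.
	fmt.Println(merged[runLabelKey], merged[labelRole], merged[labelSeiNetwork])
}
```

Writing the canonical map last keeps the precedence rule in one place, which is what a render_test label assertion would want to pin down.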
/xreview round-1: RESOLVED (3 blinded lenses: idiom · systems · k8s-dissenter)

Dispatched iterative review on the foundation before more accretes. The dissenter did not ratify on first pass; all findings now resolved in
Ratified by all three: partial-provision teardown (append-before-wait), probe client, sequential provisioning, SDK consumption. SDK |
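A minimal sketch of the append-before-wait idea behind partial-provision teardown, assuming hypothetical `node` and `chain` types rather than the real SDK handles: each created node is recorded before its readiness wait, so a failed wait still leaves it visible to teardown.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotCaughtUp simulates a readiness gate that never passes.
var errNotCaughtUp = errors.New("never caught up")

// node is a hypothetical stand-in for an SDK node handle.
type node struct {
	name    string
	deleted bool
}

// chain holds local Go state for everything provisioned so far.
type chain struct{ nodes []*node }

// provision creates n nodes. Each handle is appended BEFORE waiting on it,
// so a node whose wait fails is still known to teardown.
func (c *chain) provision(n int, wait func(*node) error) error {
	for i := 0; i < n; i++ {
		nd := &node{name: fmt.Sprintf("rpc-%d", i)}
		c.nodes = append(c.nodes, nd) // append before wait
		if err := wait(nd); err != nil {
			return fmt.Errorf("wait %s: %w", nd.name, err)
		}
	}
	return nil
}

// teardown deletes every recorded node, including one whose wait failed.
func (c *chain) teardown() {
	for _, nd := range c.nodes {
		nd.deleted = true
	}
}

func main() {
	c := &chain{}
	err := c.provision(3, func(nd *node) error {
		if nd.name == "rpc-1" {
			return errNotCaughtUp
		}
		return nil
	})
	c.teardown()
	fmt.Println("provision error:", err, "| nodes torn down:", len(c.nodes))
}
```

Had the append come after the wait, the node that stalled would exist in the cluster but not in the chain's state, and teardown would skip it.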
PR Summary
Medium Risk
Overview: Introduces
Reviewed by Cursor Bugbot for commit bd15399.
if ch.network != nil {
	if err := ch.network.Delete(ctx); err != nil {
		errs = append(errs, fmt.Errorf("delete network %q: %w", ch.network.Name(), err))
	}
Teardown orphans validator SeiNodes
High Severity
provision creates a genesis SeiNetwork without deletionPolicy: Delete, and teardown only deletes RPC SeiNodes plus that network. With the API default Retain, deleting the network orphans its validator children instead of removing them. Those controller-created validators never get sei.io/harness-run, so neither normal t.Cleanup nor the documented label-GC sweep reaps them, leaking CRs and workloads in the shared nightly namespace on every run.
Reviewed by Cursor Bugbot for commit 3997e2c.
lint flagged "run-xyz" at 4 occurrences in render_test. Extract testRunLabel/testRunID constants alongside the other k8s test fixtures.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
bugbot run
✅ Bugbot reviewed your changes and found no new issues!
1 issue from previous review remains unresolved.
Reviewed by Cursor Bugbot for commit bd15399.
…-sev follow-up to #428) (#429)

* fix(sdk,test): cascade-delete ephemeral chain validators (Bugbot high-sev)

Bugbot (PR #428) caught a real leak: provision creates the genesis SeiNetwork, and teardown deletes that network, but the CRD defaults DeletionPolicy=Retain, and the controller strips the validator children's ownerRef under Retain (removeOwnerRef), so deleting the network ORPHANS the controller-created validator SeiNodes. Those validators never carry sei.io/harness-run (the harness doesn't create them), so neither t.Cleanup nor the label-GC sweep reaps them, leaking 4 validators + PVCs per run in the shared nightly namespace.

Add DeletionPolicy to the SDK NetworkSpec (string + DeletionDelete/Retain constants, stdlib-only core; threaded into renderNetwork). The integration harness sets DeletionDelete so an ephemeral chain cascade-deletes its validators (+ PVCs) on teardown. Locked by a render_test assertion.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(test,sdk): tighten comment register per expert review

Two-lens comment-standards review (idiom D10 + prose dual-audience):
- strip meta/ID cruft from code comments (review-tool + design-step/decision IDs belong in the PR, not source)
- drop migration-history framing (present-state only)
- fix one drift: TestBenchmark doc claimed seiload drive/report the body skips
- trim the agent-verbose package doc + de-duplicate the DeletionPolicy rationale to a single canonical home

No behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
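To make the DeletionPolicy change concrete, here is a rough sketch of how a string-typed policy might be threaded into a rendered SeiNetwork object. The `NetworkSpec` shape, the `validators` field name, the constant values, and this `renderNetwork` body are all assumptions for illustration; the commit only states that the field, the DeletionDelete/Retain constants, and the threading exist.

```go
package main

import "fmt"

// Constant names follow the commit message; the string values are assumed
// to match the CRD's deletionPolicy values.
const (
	DeletionDelete = "Delete"
	DeletionRetain = "Retain"
)

// NetworkSpec is a trimmed, hypothetical stand-in for the SDK spec.
type NetworkSpec struct {
	Name           string
	Validators     int
	DeletionPolicy string // empty: leave the API default (Retain) in effect
	Labels         map[string]string
}

// renderNetwork builds an unstructured-style object. deletionPolicy is only
// emitted when set, so an unset field falls through to the API default.
func renderNetwork(s NetworkSpec) map[string]any {
	spec := map[string]any{"validators": s.Validators} // field name assumed
	if s.DeletionPolicy != "" {
		spec["deletionPolicy"] = s.DeletionPolicy
	}
	return map[string]any{
		"kind":     "SeiNetwork",
		"metadata": map[string]any{"name": s.Name, "labels": s.Labels},
		"spec":     spec,
	}
}

func main() {
	// An ephemeral harness chain opts into cascade deletion explicitly.
	obj := renderNetwork(NetworkSpec{
		Name:           "bench",
		Validators:     4,
		DeletionPolicy: DeletionDelete,
		Labels:         map[string]string{"sei.io/harness-run": "run-xyz"},
	})
	fmt.Println(obj["spec"])
}
```

Leaving the field out of the rendered object when it is empty keeps existing callers on the Retain default, so only the harness changes behavior.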


WS-I Step 1 (foundation): Go-native nightly harness as `go test` targets

First increment of the Go-native test harness that replaces the Chaos-Mesh Workflow DAG + seitask Task pods + workflow-vars ConfigMap. Draft: lands the architecture + provisioning spine; the seiload drive + S3 report is the next increment (marked TODO + `t.Skip`).

What's here
- `test/integration/harness_test.go`: shared machinery: `spec`/`chain` (local Go state, replacing workflow-vars), `provision` (SDK in-process: CreateNetwork 4 validators + N RPC SeiNodes, each waited Running→caught-up→EVM-serving), `teardown` via `t.Cleanup`, `sei.io/harness-run` label const, env gate.
- `test/integration/benchmark_test.go`: `TestBenchmark` (load suite).

Architecture decisions (WS-I LLD)
- `go test` targets in `_test.go` (not a CLI binary): Go's `_test.go` rule guarantees zero test code in any production binary; `//go:build integration` hides it from default CI. Verified: `go build ./...` clean, no integration code in `cmd/` deps.
- `go test -c` image, one CronJob per target (`-test.run TestX`). Targets: TestBenchmark / TestChaosSuite / TestChainUpgrade / TestRelease.
- `sdk/sei` (+ k8s provider).

Verification
`gofmt` clean · `go build ./...` clean · `go vet -tags integration ./test/integration` clean · `go test -c -tags integration` → binary exposes `TestBenchmark` · skips cleanly without `SEI_NODE_CLUSTER`.

Next increment
seiload as a decoupled unit (apply its own manifest w/ `evmEndpoints()`, stamped `harness-run`) → wait → read S3 report → assert TPS. Then mark ready + Coral/Bugbot/CI gate.

🤖 Generated with Claude Code
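The env gate and the `envOr` helper mentioned in the review notes might be shaped like the sketch below. The signatures, the `gate` function, and the `SEI_RPC_NODES` variable are assumptions; only `SEI_NODE_CLUSTER` and the name `envOr` come from the PR. In the real suite the skip reason would presumably go to `t.Skipf` inside a `//go:build integration` test file; it is shown here as a plain program so it runs on its own.

```go
package main

import (
	"fmt"
	"os"
)

// envOr returns the value of key, or def when the variable is unset or empty.
// (Signature assumed; the PR only names the helper.)
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// gate (hypothetical) reports whether the suite should run and, if not, a
// reason a test could pass to t.Skipf.
func gate() (cluster string, ok bool, reason string) {
	cluster = os.Getenv("SEI_NODE_CLUSTER")
	if cluster == "" {
		return "", false, "SEI_NODE_CLUSTER not set; skipping integration suite"
	}
	return cluster, true, ""
}

func main() {
	cluster, ok, reason := gate()
	if !ok {
		fmt.Println("skip:", reason)
		return
	}
	// SEI_RPC_NODES is an illustrative knob, not one defined by the PR.
	fmt.Println("cluster:", cluster, "rpc nodes:", envOr("SEI_RPC_NODES", "1"))
}
```

Gating on an environment variable in addition to the build tag means a binary built with `-tags integration` still skips cleanly when run outside a configured cluster.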