feat(dataset): stage Pod + tar-over-exec stream (PR-b of #151)#9
Merged
Conversation
Completes Phase 3 (tracebloc/client#151). PR-a delivered the no-op-safe pre-flight; this PR-b plugs in the actual file staging — ephemeral Pod, client-go SPDY executor, tar stream, SIGINT-safe cleanup. ## What lands internal/push/ (5 new files): - pod.go Stage Pod spec builder + Create/Wait/Delete lifecycle. Alpine 3.20 pinned by digest, PSA- restricted security context, activeDeadline- Seconds=600 in-cluster self-kill defense-in-depth, random 8-hex-char suffix for parallel-push collision avoidance. - stream.go Executor interface (SPDYExecutor in prod, fakeExecutor in tests). StreamLayout wires a tar.Writer through an io.Pipe to the exec stdin, running `tar -xf - -C /data/shared/<table>` in the Pod. Captures remote stderr into the error surface so customers see "no space left on device" verbatim, not a generic exec failure. - progress.go schollz/progressbar/v3 wrapper, TTY-detected via golang.org/x/term. No-op in CI / non-TTY output so the bar doesn't pollute log streams. - orphan.go Scans for stage Pods labeled managed-by=tracebloc-cli older than 5 min; renders an actionable warning with the kubectl-delete command. Warn-only in v0.1 (auto-cleanup is v0.2 — can't distinguish "crashed previous push" from "still-running parallel push from another workstation"). - stage.go The Stage() orchestrator. Order: orphan scan → CreateStagePod → defer DeleteStagePod with a FRESH context (SIGINT-safe — defers fire even when the parent ctx is cancelled) → WaitForReady → StreamLayout → cleanup. On any error past Create, the deferred delete still runs. internal/cli/dataset.go: - Replace the exit-6 stub with push.Stage(...) wired to cluster-discovered ns/PVC/SA + the new SPDYExecutor. - Add v0.1 category gate (Medium-1 from PR-a self-review). Runs BEFORE schema validation so unsupported-but-schema- valid categories like tabular_classification get the actionable "v0.1 supports only image_classification" message instead of the schema's confusing "missing property 'schema'". - Add --stage-pod-image flag for air-gapped customers. - Update exit-code doc: drop exit 6 (PR-b stub), add exit 7 (staging-step failed, distinct from pre-flight 3/4). internal/push/stage_test.go covers Medium-2 from PR-a review: TestStage_IngestorSANameFlowsToPod pins that the --ingestor-sa override actually lands on the stage Pod's ServiceAccountName. ## Tricky bits Pipe deadlock guard: when the executor returns early (ctx cancel, remote tar dies immediately), the tar-write goroutine would block on pipe.Write forever because nothing reads. Fix: close the pipe Reader after exec.Exec returns, which unblocks the writer with io.ErrClosedPipe. Caught by TestStage_CancelledContext_StillCleansUp. SIGINT safety: the deferred DeleteStagePod uses a fresh ctx with a 30s deadline rather than the (possibly cancelled) parent ctx. Without this, Ctrl-C right after pod-create leaks an orphan Pod. activeDeadlineSeconds=600 is the in-cluster backstop for the truly-pathological case (hard-kill, network partition between laptop and cluster). errcheck cleanup: writeLayoutTar uses a named return + deferred close that promotes a tar trailer-write error if and only if the function otherwise succeeded — silent truncation would be much worse than a noisy error. ## Test plan - [x] go vet ./... — green - [x] go test -race -cover ./... — green (push 79.8%, cluster 83.2%, schema 80.7%, cli 52.2%) - [x] gofmt -s -l . — no drift - [x] errcheck ./... — green - [x] go build — binary builds - [x] Local smoke: --category=tabular_classification → exit 2 with "v0.1 supports only image_classification" message (the new gate's actionable diagnostic) - [ ] Real EKS smoke (manual; out of CI scope) ## Open items deferred Per PR-a review's Low/Nit list: - runDatasetPushArgs.Context shadowing context.Context - printPushPreflight rendering AccessModes with %v - testutil package consolidation for imgcDir / imgcLayout - dataset_test kubeconfig path traversal via subtests - Discover hint for nested-image-dirs These are tracked for v0.2 cleanup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
|
👋 Heads-up — Code review queue is at 21 / 8 Above the WIP limit. The team convention is to review existing PRs before opening new work. Open PRs currently in Code review (oldest first):
Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.) |
…+Med) Two real bugs caught by Bugbot's re-review on PR-b: ## 1. High — Stage Pod name breaks DNS-1123 BuildStagePodSpec used the raw table name as a Pod name segment. ValidateTableName accepts [A-Za-z0-9_]+ (MySQL identifier rules), but K8s Pod names follow DNS-1123 subdomain rules: lowercase alnum + hyphen, must start/end alnum. The canonical example throughout tracebloc docs (cats_dogs_train, snake_case) would fail Pod create post-pre-flight — worst-of-both-worlds UX (pre-flight says "good!" then create fails). Fix: transform the table name into a DNS-1123-safe segment for the Pod name only — lowercase, _→-, trim leading/trailing hyphens, cap length, fallback to "tbl" for the pathological all-underscore case. The original verbatim name still goes in the tracebloc.io/table label so orphan warnings stay traceable. TestDNS1123SafeTableSegment covers cats_dogs → cats-dogs, MyTable → mytable, _leading_underscore → leading-underscore, _ → tbl, 50-char → 30-char truncation. Each result is cross-validated against an inline DNS-1123 regex check. ## 2. Medium — SIGINT skips Pod cleanup push.Stage documented SIGINT-safe cleanup via context cancellation + fresh-ctx deferred DeleteStagePod, but cmd/tracebloc/main.go called Execute() without signal.NotifyContext. Default Go runtime behaviour on SIGINT is to exit without running defers — so every Ctrl-C during staging leaked an orphan Pod until activeDeadlineSeconds (10 min) fired. The docstring was false advertising. Fix: signal.NotifyContext(ctx, SIGINT, SIGTERM) in main, passed to ExecuteContext. First Ctrl-C cancels the cobra ctx → push.Stage's in-flight HTTP cancels → deferred cleanup runs (with its own fresh ctx, also kept). Second Ctrl-C does normal hard-kill (stop() in defer un-registers the handler) — important if the cleanup itself hangs. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two new findings from Bugbot's re-review on commit 35ae044: ## Medium — WaitForStagePodReady polled through terminal errors The poll callback returned (false, nil) for EVERY Pods.Get error, which meant Pod-deleted-out-of-band (NotFound) or RBAC-revoked-mid- push (Forbidden) waited the full 60s timeout despite the situation being unrecoverable. Customer-facing impact: 60s of "Waiting for stage Pod to be Ready..." for an error that should surface in <1s. Fix: classify the error. NotFound + Forbidden terminate the poll immediately (return (false, err)); everything else stays transient (network blip, brief API unavailability). The apierrors.Is{NotFound,Forbidden} helpers were already imported from the previous fixes, so this is a single-block change. Two new tests pin the contract via t.Now() bounds — if a regression makes either case transient again, the test waits the full 60s and the assertion catches it. ## Low — BuildStagePodSpec swallowed crypto/rand error `suffix, _ := randomSuffix(4)` — if crypto/rand failed (rare but possible on systems with exhausted entropy), the suffix was empty and the Pod name became `tracebloc-stage-cats-dogs-` with a bare trailing hyphen. DNS-1123 rejects that → opaque API error from kube-apiserver instead of a clear local diagnostic. Fix: change BuildStagePodSpec signature to (*Pod, error) and propagate the rand failure. CreateStagePod (the prod caller) already returns an error; tests use a mustBuildStagePod helper that t.Fatal's on the rare path. 8 test call sites updated via find-and-replace, plus the override-image test by hand. The third inline comment ("SIGINT skips pod cleanup", bug_id 24ab9106) is a stale carry-over of the prior-round finding I already fixed in 35ae044 — same bug_id, GitHub auto-shifted the anchor onto the new commit. Replied in-thread pointing at the fix. Bugbot's review-body confirms "2 NEW findings", not 3. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round 4 of Bugbot fixes on PR-b. Two new findings on commit 7baa321 (plus 3 stale carry-overs from earlier rounds — same bug IDs as already-fixed findings, GitHub keeps re-anchoring them onto each new commit; Bugbot's review-body confirms "2 NEW findings"). ## Medium-Security — symlink images bypassed size caps The vulnerability: Discover sized image entries via DirEntry.Info() (Lstat-equivalent: returns the symlink's own ~100-byte size). writeTarFile read them via os.Stat + os.Open (which follow symlinks). So a customer with `images/evil.jpg` symlinked to `/var/log/system.log` (multi-GB) or `~/.ssh/id_rsa` could: 1. Pass Discover's MaxSingleFileBytes + MaxTotalBytes caps trivially (cap is 1 GiB; the symlink itself is ~100 bytes). 2. Stream the target's FULL contents into the cluster PVC, where the cluster admin can read them via `kubectl exec`. This is both a size-cap bypass AND an arbitrary-local-file disclosure. Worse, it can fire unintentionally — a customer who ran `ln -s ~/datasets/big-images images` to share data across projects would silently stream gigabytes during what they thought was a small test push. Fix in three layers: 1. walk.go: os.Stat → os.Lstat for the labels.csv check (so the mode bits include ModeSymlink) + new rejectSymlink() helper called for both labels.csv and each image entry. v0.1 rejects symlinks outright with a clear "v0.1 doesn't allow symlinks (security: ...)" message pointing at `cp -L` for materializing and the v0.2 cloud-source story for distributed data. 2. stream.go: writeTarFile's os.Stat → os.Lstat + symlink check too. Defense-in-depth — Discover is the primary fix, but a future refactor that calls writeTarFile directly (e.g. a new resume-from-partial-transfer code path in v0.2) would re-introduce the hole without this layer. 3. New tests pin both rejections (labels.csv-as-symlink + image- as-symlink). Skipped on Windows where symlinks require admin. ## Medium — WaitForStagePodReady didn't short-circuit on Phase=Failed Companion to the NotFound/Forbidden short-circuit landed in the previous commit. The poll positively-terminated on Ready=True but never negatively-terminated on Phase=Failed (container crashed at startup: PSA rejection, image crashloop, OOM at container init) or Phase=Succeeded (stage container's sleep exited unexpectedly). Same 60s timeout symptom for an outcome that's actually decided in <5s. Fix: in the poll body, check Phase after the conditions loop; return a structured error including the podReadyTimeoutHint output (container-status reason + message) so the customer sees "terminated in phase Failed (OOMKilled — ...)" instead of a generic timeout. New test TestWaitForStagePodReady_FailedPhaseIsTerminal pins the <3s elapsed contract for an OOMKilled startup. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ress defer Round 5 of Bugbot fixes on PR-b. The "5 potential issues" review-body count corresponds to 5 NEW findings; the other 5 inline comments are stale carry-overs of bug IDs already fixed in earlier rounds (24ab9106, 15e29a3e, 258869d3, 5c577688, 7fcf137f — GitHub auto-anchors them onto each new commit). ## Medium — Symlinked images/ directory bypassed all checks Round 4 fixed symlinked FILES inside images/ (Lstat on each entry). It missed the case where images/ ITSELF is a symlink — `ln -s /etc images` passed os.Stat (which follows symlinks), the IsDir check succeeded, and the loop walked /etc. Same disclosure pattern as the round-4 finding, one layer up. Fix: os.Stat → os.Lstat on imagesDir + rejectSymlink before the IsDir check. Symmetric with the labels.csv treatment from round 4. New test TestDiscover_RejectsSymlinkedImagesDir pins the rejection. ## Medium — Table label could exceed Kubernetes 63-char limit ValidateTableName accepted [A-Za-z0-9_]+ without a length cap, but the stage Pod's tracebloc.io/table label carries the raw name. K8s label values cap at 63 chars (DNS-1123 label rule), so a 100-char name passed pre-flight then failed Pod creation with an opaque label-validation error. Fix: cap at 63 chars (also matches MySQL's 64-char identifier limit with 1 char of headroom for any future ingestion-run-id suffix). New MaxTableNameLength const + boundary test pin the cap. ## Medium — Re-push left stale PVC files The remote command was `mkdir -p <dest> && tar -xf - -C <dest>`. If a previous push had 3 images and the new push has 2, the PVC ends up with the union — and labels.csv now disagrees with the stage directory, silently breaking ingestion. Fix: `rm -rf <dest> && mkdir -p <dest> && tar -xf - -C <dest>`. Safe because dest = StagedPrefix(table) = /data/shared/<table> and ValidateTableName has ensured `table` is a single safe segment, so rm -rf can only nuke that one per-table subdir, never the parent /data/shared or sibling tables. TestStreamLayout_RemoteCommand updated to assert the rm AND its ordering before mkdir. ## Low — Progress bar not Finish'd on early Stage failure Stage's deferred Finish lives inside StreamLayout, so a failure earlier in the lifecycle (CreateStagePod, WaitForStagePodReady) returned without clearing the TTY bar. Visual artifact on the customer's terminal after a Pod-create failure. Fix: defer progress.Finish() at the construction site in runDatasetPush. Schollz Finish is idempotent so double-call on happy path is a no-op. ## Hard-link bypass — documented, not fixed Bugbot also flagged that hard links to outside files aren't rejected. Filesystem mode bits don't distinguish a hard link from a regular file the way ModeSymlink distinguishes symlinks, and a high-Nlink check has too many false positives. The implicit security boundary is the CLI's process-level read permissions: a customer can only hard-link files they already have read access to, so this isn't a privilege escalation. v0.2 may add openat(O_NOFOLLOW) sandboxing if customers need harder isolation. Documented in rejectSymlink's docstring as a known limitation alongside the v0.2 plan. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round 6. Review-body says "2 NEW + 3 unresolved" — Bugbot itself is
flagging the 9 carry-overs at this point. Two new findings, both
real:
## High — Pod activeDeadlineSeconds (600s) cut too close
The deadline starts at Pod CREATION, but image pull (up to 60s) +
WaitForStagePodReady ceiling (60s) + worst-case stream (1 GiB at
2 MB/s ≈ 8.5 min) eat the budget. Near-cap customers on slow
uplinks could see the kubelet terminate mid-transfer with no
useful diagnostic.
Fix: bump StagePodActiveDeadline from 600 → 1800 (30 min). Budget
breakdown in the const's docstring; comfortable margin for
variance. Cost is "an idle alpine Pod with sleep idles for ~22min
after a successful push" — ~5 MiB cluster RAM, zero CPU, deleted
seconds later by the defer'd cleanup anyway.
Test pin: TestStagePodActiveDeadline_CoversFullLifecycle asserts
the floor at 1500s so a future regression to 600 is caught.
## Medium — Windows tar paths used backslashes
`filepath.Join("images", filepath.Base(abs))` produces
`images\file.jpg` on Windows. USTAR / POSIX tar requires forward
slashes; the Linux stage Pod's `tar -xf` either rejects or
extracts as a flat-named file (collapsing the images/ subdir
the ingest spec expects). Breaks the Windows-built CLI.
Fix: switch to `path.Join` for the tar HEADER name. Keep
`filepath.Join` everywhere else (where the OS-native separator
is the right thing). Test pin
TestStreamLayout_TarPathsAreForwardSlash asserts no backslashes
in any tar entry name, regardless of host OS.
## Carry-overs
Bugbot's review body now explicitly says "There are 3 total
unresolved" — it knows the carry-overs aren't on the new commit
anymore. Counts I'm tracking as known-fixed across earlier
rounds: 24ab9106 (SIGINT, r3), 15e29a3e (Pod-Get spin, r3),
258869d3 (rand error, r3), 5c577688 (Failed Pod, r4), 7fcf137f
(symlink files, r4), de426248 (hard links, r5 — documented
limitation), cd462de9 (symlink dir, r5), a8e9e5c7 (label len,
r5), 0b296807 (progress finish, r5).
Local: vet, test -race -cover, gofmt -s, errcheck — all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…re-push
Round 7 of Bugbot fixes on PR-b. Review-body confirmed "4 NEW + ...".
All four real, all four fixed:
## Medium — Wait errors mislabeled as timeout
WaitForStagePodReady wrapped EVERY poll exit (NotFound, Forbidden,
Phase=Failed, ctx.Canceled from SIGINT) in "did not become Ready
within 60s." Misleading: the customer who hit Ctrl-C 2 seconds
into the wait would see a "timed out" diagnostic.
Fix: branch on errors.Is(err, context.DeadlineExceeded). True
timeout keeps the "did not become Ready within %s" wording;
early-exit cases surface as "did not reach Ready state: %w" with
the actual cause front and center.
## Medium — Orphan-delete hint would nuke parallel pushes
FormatOrphansWarning's kubectl-delete command used the
`managed-by=tracebloc-cli,component=stage-pod` label selector,
which matches every stage Pod in the namespace including
LEGITIMATE RUNNING ONES from parallel pushes (other workstations,
or even this one's just-started Pod). A customer copy-pasting the
suggested cleanup could silently kill someone else's in-progress
push.
Fix: list specific orphan Pod names in the delete command —
`kubectl delete pod -n <ns> <name1> <name2> ...`. Test regression-
pins the absence of the label-selector form.
## Medium — Re-push deleted before transfer succeeded
Round 5's hermetic-re-push fix (rm -rf $DEST && mkdir + tar)
satisfied "no stale files" but not "preserve on failure." A
tar mid-stream failure (Ctrl-C, network drop, container OOM) left
the customer with NOTHING on the PVC — the previous push's data
already nuked, the new push aborted before completing.
Fix: extract to <dest>.staging, swap on success:
rm -rf <dest>.staging # cleanup any prior failure
mkdir -p <dest>.staging && tar -xf - -C <dest>.staging
rm -rf <dest> && mv <dest>.staging <dest> # swap on success
rm -rf <dest>.staging # defensive cleanup
The shell's && sequencing means swap only fires if tar succeeded.
Lost the prior `exec /bin/tar` micro-optimization (can't exec
mid-chain) — fine, the shell process is tiny. The window between
the destination rm and the mv is sub-ms; closing it fully would
need a double-mv (old/new/cleanup) which is v0.2 territory.
stream_test.go pins:
- mkdir-p of .staging happens
- tar extracts to .staging (not directly to dest)
- mv from .staging to dest exists
- tar happens BEFORE any rm of $DEST (transactional property)
- no rm of the parent /data/shared (sibling-table safety)
## Medium — Orphan scan flagged active pushes
FindOrphanStagePods's only filter was age > 5min. But pod.go itself
budgets ~8.5 min for a 1 GiB transfer — a legitimately-running
near-cap push would be mislabeled as orphan by the same customer's
next concurrent push, and the (newly-fixed) delete hint would now
correctly target that specific running Pod by name, killing it.
Fix: skip Phase=Running pods entirely. A Running stage Pod is
presumed to be doing real work; activeDeadlineSeconds is its
cluster-side safety net. Pods in non-Running phases past grace
(Pending stuck on image pull, Failed from crash) still flag —
those are the genuine orphan shapes.
Two new tests:
- TestFindOrphanStagePods_SkipsRunningPods: 30-min-old Running
Pod doesn't surface as orphan
- TestFindOrphanStagePods_FlagsNonRunningPastGrace: 30-min-old
Failed Pod DOES surface (regression guard — narrowing the
filter should reduce false positives, not eliminate the
warning entirely)
Local: vet, test -race -cover, gofmt -s, errcheck — all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round 8 of Bugbot fixes. Two new findings on commit 42c5e93, both real: ## Medium — Parallel pushes corrupted staging Two concurrent `dataset push` runs for the same `--table` shared one `$DEST.staging` path on the PVC. Each Pod's `rm -rf $DEST.staging`, `tar -xf -`, `mv` sequence raced against the other, with no coordination. Worst case: push A's `rm -rf` wipes push B's mid-extraction state, B's tar then writes into a partially-removed dir, customer ends up with an interleaved- corrupted-and-half-committed dataset. Fix: each invocation generates a fresh 8-hex-char suffix for the staging dir name (`$DEST.staging-<8hex>`), reusing pod.go's randomSuffix. Two parallel pushes now have distinct staging dirs — no interleaved-write hole. The FINAL `rm $DEST && mv $STAGING $DEST` step is still last-write-wins under concurrent commits, but that's an acceptable v0.1 semantic (concurrent pushes for the same table are inherently undefined; whichever commits last "wins" with a COHERENT dataset, not an interleaved one). New test TestStreamLayout_StagingSuffixIsUniquePerInvocation pins the contract by calling StreamLayout twice and asserting distinct staging suffixes. ## Medium — Stream skipped size-cap enforcement (TOCTOU) Pre-flight Discover checked MaxSingleFileBytes + MaxTotalBytes, but writeTarFile / writeLayoutTar streamed files later via os.Lstat → io.Copy with NO re-check. A file that grew between pre-flight and stream (legitimate dataset prep racing the push, or adversarial overwrites) would silently upload past the 500 MiB / 1 GiB advertised caps. Fix in three layers: 1. writeTarFile re-checks the single-file cap (os.Lstat size vs MaxSingleFileBytes) right before WriteHeader. A file that grew gets a clear "exceeds v0.1 single-file cap" error using the same sizeError formatter Discover uses. 2. writeTarFile now returns (int64, error) — the declared header size. writeLayoutTar accumulates this into a running total and aborts mid-stream if it exceeds MaxTotalBytes. Tests added below to pin the running-total contract. 3. io.Copy → io.CopyN(tw, f, st.Size()) caps the actual byte transfer at the declared header size. Without the cap, a file that grew between Lstat and io.Copy would overflow the tar header — header says N bytes, body has > N → tar archive corruption. CopyN-and-trust truncates instead. Closes both the metadata-side (header size) and body-side (byte stream) halves of the stream-time TOCTOU window. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dirs
Round 9 of Bugbot fixes. Both findings real:
## Medium — Tar goroutine could deadlock on suffix error
The randomSuffix call landed AFTER the tar-write goroutine spawned.
If crypto/rand failed (rare but possible on entropy-starved systems
— same scenario the round-3 BuildStagePodSpec fix handles), the
function returned without closing the pipe or waiting on the
goroutine. The goroutine would then block once the ~64 KB pipe
buffer filled, leaking forever.
Fix: hoist randomSuffix (and the entire remoteCmd composition)
ABOVE the io.Pipe + goroutine setup. The function now bails on
suffix failure before touching the pipe at all, so there's
nothing to deadlock on.
## Medium — Failed pushes leaked .staging-<hex> dirs
Round 8's unique-per-invocation suffix fixed parallel-push
interleaving but created a new leak: if THIS push fails before
the final `mv` step, the .staging-<hex> dir lingers on the PVC.
Round 8's `rm -rf $STAGING` at the start only cleans up THIS
invocation's path, not the previous-failed one. Repeated failed
pushes accumulate unbounded storage on the shared PVC.
Fix: append a defensive cleanup pass to the remote script:
find $(dirname $DEST) -maxdepth 1 -name "$(basename $DEST).staging-*" \\
-mmin +60 -exec rm -rf {} + 2>/dev/null || true
Constraints baked into the pattern:
- -mmin +60 (1 hour) is 2x activeDeadlineSeconds — anything
older HAS to be from a dead stage Pod, so we can't race with
a parallel push's in-progress staging
- -name pattern scopes to THIS table's staging siblings only;
other tables' .staging-* dirs are none of our business
- 2>/dev/null || true — find failures don't fail the whole
stream (defensive cleanup is best-effort, the customer's
actual push already committed before this runs)
- Uses ';' (semicolon) instead of '&&' so it runs even if the
main push sequence failed somehow — orphan cleanup should
happen regardless
Updated TestStreamLayout_RemoteCommand asserts the find pattern
is present with the exact -mmin window and table-scoped -name.
Local: vet, test -race -cover, gofmt -s, errcheck — all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bugbot's tenth-round single finding: the previous r7 pattern
of `rm $DEST && mv $STAGING $DEST` had a small but real failure
window — if the rm succeeded and the mv failed (ENOSPC, fs-level
rename error, transient kernel weirdness), the customer's
previous dataset was already gone and the new data was stranded
under .staging-<hex> where the ingestor wouldn't see it.
Fix: backup-and-rollback. Rename the existing $DEST to a unique
$DEST.old-<hex> sibling FIRST; only then mv $STAGING into $DEST;
on the latter's failure, restore the backup. On success, drop
the backup.
{ [ -e $DEST ] && mv $DEST $DEST.old-<hex> || true; } &&
{ mv $STAGING $DEST ||
{ [ -e $DEST.old-<hex> ] && mv $DEST.old-<hex> $DEST; exit 1; }; } &&
rm -rf $DEST.old-<hex>
Properties this gives us:
- tar fails: $DEST untouched (transactional from r7)
- backup mv fails: $DEST untouched
- main mv fails: backup at .old-<hex> survives; customer
recovers via `kubectl exec -- mv .old-<hex> $DEST`
- final rm fails: just leaves an .old-<hex> cruft, picked up
by the r9 find -mmin +60 sweep (extended to also catch
.old-* siblings)
- first-ever push (no pre-existing $DEST): `[ -e $DEST ] && ...`
short-circuits, backup mv silently skipped via `|| true`
The .old-<hex> suffix reuses the same random as .staging-<hex>
so two parallel pushes can't collide on the backup name. Both
.staging-* AND .old-* now flow through the orphan-cleanup find
pattern (with the `-o` alternation), so r9's leak-prevention
covers both halves of the swap.
Test updates pin:
- Backup mv (DEST → .old) appears BEFORE primary mv
(.staging → DEST) — rollback contract
- Backup and staging suffixes MUST agree (rollback-target
correctness)
- find pattern includes both -name "...staging-*" and
-name "...old-*"
Local: vet, test -race -cover, gofmt -s, errcheck — all green.
This is round 10. The new-finding rate has been decaying:
5→2→2→2→1. We're approaching the asymptote.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merged
5 tasks
…t r11)
Two new findings on r10's commit, both real:
## Medium — Remote script masked failures via POSIX `;` + `|| true`
The previous shell used `&&-chain ; find ... || true`. POSIX `;`
runs the next command regardless of prior exit status, and the
trailing `|| true` then forces exit 0. A failed tar or mv earlier
in the && chain would set $? to non-zero, then `;` would clobber
it back to 0 via the find's `|| true`. Result: a remote push
that actually failed would return exit 0 to the exec subprocess,
and StreamLayout would happily report "Staged N files" on a
catastrophic failure.
Fix: rewrite the script using `set -e` + explicit newline-
separated statements + an explicit `if ! mv; then rollback;
exit 1; fi` for the swap. With set -e:
- Any non-guarded non-zero exits the script with that status.
- `cmd || true` continues to suppress (find cleanup stays
best-effort).
- `if ! cmd; then ...; fi` is treated as guarded, so set -e
doesn't preempt the explicit rollback handler.
Multi-line shell-c args work across busybox sh (alpine 3.20),
dash, and bash equally. No portability regression.
## Medium — Stream error masked the upstream tar error
After r8 added stream-time MaxSingleFileBytes / MaxTotalBytes
rechecks in writeLayoutTar, the LOCAL tar build can legitimately
fail mid-stream. When it does, the goroutine's CloseWithError
propagates to the pipe reader; exec sees broken-pipe and returns
a generic "exec stream against ns/pod failed" error.
The previous code checked streamErr FIRST and returned the
exec-flavored framing. Customer saw "streaming files failed"
instead of the actually-actionable "dataset exceeded v0.1 total
cap of 1.00 GiB after streaming ..." diagnostic from the tar
side.
Fix: swap the order. Check tarErr first — when both are non-nil,
the tar side is the upstream cause; the broken-pipe streamErr is
downstream noise. streamErr-only (no tarErr) is still the real
network/RBAC/remote-tar case and gets surfaced with the exec
wording.
Local: vet, test -race -cover, gofmt -s, errcheck — all green.
Round 11 finding rate: 2 new (1+2+5+2+4+2+2+1+2 → still trending
roughly downward, though not monotone). One more clean round
and we're at the asymptote.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 tasks
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 8bfada3. Configure here.
Independent review pass over PR #9 (post-bugbot-r11). Five findings, none caught by bugbot — all cross-cutting doc/style: 1. stream.go: dead comment block above remoteCmd composition. The block was a holdover from the && era; it claimed "&& sequencing means the swap only fires if tar succeeded" but we switched to `set -e` in r11. The canonical pipeline doc lives below the composition; collapsed the upstairs block into a single semantic- guarantees summary (HERMETIC/TRANSACTIONAL/PARALLEL-SAFE/EXIT- FAITHFUL) and removed the contradictory && reference. 2. stage.go: StageOptions docstring referenced a non-existent `OrphanLogger` field. Stale from an early design where orphan warnings were emitted via a logger callback rather than the integrated FormatOrphansWarning we ended up with. Corrected to the actual nil-able fields (Progress, Out). 3. orphan.go: OrphanGracePeriod comment claimed "5 minutes is generous: a healthy stage Pod is fully done in ~30 seconds... under a couple minutes at the 1 GiB cap." That directly contradicts pod.go's StagePodActiveDeadline budget (~8.5 min for a 1 GiB transfer at 2 MB/s, plus image pull, plus ready wait). Rewrote to reflect the post-r7 semantic: Running Pods are never flagged regardless of age; the 5-min grace targets the Pending/Failed/Unknown shapes that are genuinely stuck. 4. orphan.go: joinNames was a 9-line wrapper around `strings.Join( names, " ")`. The comment justifying it ("single point of change for tests") doesn't hold — the tests assert on the final output string, not on this helper's surface. Inlined. 5. cli/dataset.go: runDatasetPush docstring still said "the PR-a slim implementation. It performs every pre-flight check... the actual file staging is gated behind a clear 'not yet implemented' error." PR-b (this PR) actually implements the staging — the doc lagged the code. Updated to reflect the complete Phase 3 flow. No behavioral change. Tests + lint green. The doc fixes matter because future readers (humans and the next bugbot pass) get contradictory signals when the comments disagree with the code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Summary
Completes Phase 3 (tracebloc/client#151). PR-a (#8) delivered the no-op-safe pre-flight; this PR-b plugs in the actual file staging.
Flow now end-to-end:
New files
internal/push/pod.go— Pod spec builder + Create/Wait/Delete (alpine 3.20 pinned by digestsha256:d9e853e8..., PSA-restricted,activeDeadlineSeconds=600defense-in-depth)internal/push/stream.go— Executor interface (SPDYExecutor in prod, fakeExecutor in tests) + tar packagerinternal/push/progress.go— schollz/progressbar/v3 wrapper, TTY-detected (no-op in CI)internal/push/orphan.go— find + warn about stale stage Pods (warn-only in v0.1; auto-cleanup is v0.2)internal/push/stage.go— orchestrator with SIGINT-safe cleanup via fresh-context deferMedium issues from PR-a self-review — addressed
tabular_classification) now hit a CLI-side gate BEFORE schema validation. Customers seecategory "tabular_classification" is not supported in v0.1 (only image_classification)instead of the confusingmissing property 'schema'from the schema validator.--ingestor-satest coverage:TestStage_IngestorSANameFlowsToPodpins that the override actually flows to the stage Pod'sServiceAccountName.Tricky bits worth a closer review
pipe.Writeforever — fixed by closing the pipe Reader afterexec.Execreturns. Caught byTestStage_CancelledContext_StillCleansUp.DeleteStagePodusescontext.WithTimeout(context.Background(), 30s)so cleanup runs even when the parent ctx is cancelled (SIGINT). Without this, every Ctrl-C leaks an orphan Pod.writeLayoutTarpropagates the tar trailer-write error if-and-only-if the function otherwise succeeded — silent truncation would be much worse than a noisy error.Exit codes (updated)
Test plan
go vet ./...— greengo test -race -cover ./...— green (push 79.8%, cluster 83.2%, schema 80.7%, cli 52.2%)gofmt -s -l .— no drifterrcheck ./...— greentracebloc/clientinstall — staging cats_dogs sampleAfter this PR merges, only Phase 4 (
tracebloc/client#152— submit-to-jobs-manager + watch + summary) and Phase 5 (#153 — release distribution) remain for v0.1.Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
Note
Medium Risk
Adds new Kubernetes Pod lifecycle +
execstreaming logic and changes CLI signal/exit-code behavior, which can impact cluster resources and data staging if regressions occur. Risk is mitigated by extensive unit tests and restrictive Pod security context, but it still touches critical transfer/cleanup paths.Overview
Completes Phase 3
dataset pushby replacing the previous “pre-flight only” behavior with actual file staging to the shared PVC: it now creates an ephemeral, PSA-restricted stage Pod, waits for readiness with better diagnostics, streamslabels.csv+images/*via tar-over-exec(SPDY) with an optional TTY progress bar, and always cleans up the Pod (including on Ctrl-C).Adds safeguards and UX improvements: a v0.1-only category gate, stricter table-name validation (including max length), best-effort orphan stage-Pod detection with actionable warnings, symlink rejection in local layout discovery, a new
--stage-pod-imageoverride for air-gapped environments, and updated exit codes/docs. Also wires SIGINT/SIGTERM into Cobra’s context so long-running staging cancels cleanly instead of skipping defers.Reviewed by Cursor Bugbot for commit 93a9ba2. Bugbot is set up for automated code reviews on this repo. Configure here.