feat(dataset): stage Pod + tar-over-exec stream (PR-b of #151) by saadqbal · Pull Request #9 · tracebloc/cli

saadqbal · 2026-05-22T08:17:38Z

Summary

Completes Phase 3 (tracebloc/client#151). PR-a (#8) delivered the no-op-safe pre-flight; this PR-b plugs in the actual file staging.

Flow now end-to-end:

validate table name + flag→spec synth + v0.1 category gate (NEW)
schema validation
local layout walk + size caps
kubeconfig + parent release + shared PVC discovery
pre-flight summary
orphan-Pod scan + warning (NEW)
create ephemeral stage Pod (alpine 3.20 by digest, PSA-restricted) (NEW)
wait Ready with image-pull / scheduler-failure hints (NEW)
tar-over-exec stream via client-go SPDY executor, with progress bar (NEW)
defer-delete the Pod with a fresh ctx (SIGINT-safe) (NEW)

New files

internal/push/pod.go — Pod spec builder + Create/Wait/Delete (alpine 3.20 pinned by digest sha256:d9e853e8..., PSA-restricted, activeDeadlineSeconds=600 defense-in-depth)
internal/push/stream.go — Executor interface (SPDYExecutor in prod, fakeExecutor in tests) + tar packager
internal/push/progress.go — schollz/progressbar/v3 wrapper, TTY-detected (no-op in CI)
internal/push/orphan.go — find + warn about stale stage Pods (warn-only in v0.1; auto-cleanup is v0.2)
internal/push/stage.go — orchestrator with SIGINT-safe cleanup via fresh-context defer

Medium issues from PR-a self-review — addressed

v0.1 category gate: schema-valid-but-unsupported categories (e.g. tabular_classification) now hit a CLI-side gate BEFORE schema validation. Customers see category "tabular_classification" is not supported in v0.1 (only image_classification) instead of the confusing missing property 'schema' from the schema validator.
--ingestor-sa test coverage: TestStage_IngestorSANameFlowsToPod pins that the override actually flows to the stage Pod's ServiceAccountName.

Tricky bits worth a closer review

Pipe deadlock guard (stream.go): if exec returns early (ctx cancel, remote tar dies), the tar-write goroutine would block on pipe.Write forever — fixed by closing the pipe Reader after exec.Exec returns. Caught by TestStage_CancelledContext_StillCleansUp.
Fresh-context defer (stage.go): the deferred DeleteStagePod uses context.WithTimeout(context.Background(), 30s) so cleanup runs even when the parent ctx is cancelled (SIGINT). Without this, every Ctrl-C leaks an orphan Pod.
Named-return tar Close (stream.go): writeLayoutTar propagates the tar trailer-write error if-and-only-if the function otherwise succeeded — silent truncation would be much worse than a noisy error.

Exit codes (updated)

Code	Cause
0	files staged successfully (Phase 4 will add: submitted + completed)
2	schema validation failed OR v0.1-unsupported category
3	local-layout or kubeconfig error
4	cluster reachable but parent release / shared PVC missing
7	pre-flight succeeded but staging failed (Pod create, image pull, exec, remote tar)

Test plan

go vet ./... — green
go test -race -cover ./... — green (push 79.8%, cluster 83.2%, schema 80.7%, cli 52.2%)
gofmt -s -l . — no drift
errcheck ./... — green
Local smoke: v0.1 gate fires with actionable message
Real EKS smoke against a tracebloc/client install — staging cats_dogs sample

After this PR merges, only Phase 4 (tracebloc/client#152 — submit-to-jobs-manager + watch + summary) and Phase 5 (#153 — release distribution) remain for v0.1.

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

Note

Medium Risk
Adds new Kubernetes Pod lifecycle + exec streaming logic and changes CLI signal/exit-code behavior, which can impact cluster resources and data staging if regressions occur. Risk is mitigated by extensive unit tests and restrictive Pod security context, but it still touches critical transfer/cleanup paths.

Overview
Completes Phase 3 dataset push by replacing the previous “pre-flight only” behavior with actual file staging to the shared PVC: it now creates an ephemeral, PSA-restricted stage Pod, waits for readiness with better diagnostics, streams labels.csv + images/* via tar-over-exec (SPDY) with an optional TTY progress bar, and always cleans up the Pod (including on Ctrl-C).

Adds safeguards and UX improvements: a v0.1-only category gate, stricter table-name validation (including max length), best-effort orphan stage-Pod detection with actionable warnings, symlink rejection in local layout discovery, a new --stage-pod-image override for air-gapped environments, and updated exit codes/docs. Also wires SIGINT/SIGTERM into Cobra’s context so long-running staging cancels cleanly instead of skipping defers.

^{Reviewed by Cursor Bugbot for commit 93a9ba2. Bugbot is set up for automated code reviews on this repo. Configure here.}

Completes Phase 3 (tracebloc/client#151). PR-a delivered the no-op-safe pre-flight; this PR-b plugs in the actual file staging — ephemeral Pod, client-go SPDY executor, tar stream, SIGINT-safe cleanup. ## What lands internal/push/ (5 new files): - pod.go Stage Pod spec builder + Create/Wait/Delete lifecycle. Alpine 3.20 pinned by digest, PSA- restricted security context, activeDeadline- Seconds=600 in-cluster self-kill defense-in-depth, random 8-hex-char suffix for parallel-push collision avoidance. - stream.go Executor interface (SPDYExecutor in prod, fakeExecutor in tests). StreamLayout wires a tar.Writer through an io.Pipe to the exec stdin, running `tar -xf - -C /data/shared/<table>` in the Pod. Captures remote stderr into the error surface so customers see "no space left on device" verbatim, not a generic exec failure. - progress.go schollz/progressbar/v3 wrapper, TTY-detected via golang.org/x/term. No-op in CI / non-TTY output so the bar doesn't pollute log streams. - orphan.go Scans for stage Pods labeled managed-by=tracebloc-cli older than 5 min; renders an actionable warning with the kubectl-delete command. Warn-only in v0.1 (auto-cleanup is v0.2 — can't distinguish "crashed previous push" from "still-running parallel push from another workstation"). - stage.go The Stage() orchestrator. Order: orphan scan → CreateStagePod → defer DeleteStagePod with a FRESH context (SIGINT-safe — defers fire even when the parent ctx is cancelled) → WaitForReady → StreamLayout → cleanup. On any error past Create, the deferred delete still runs. internal/cli/dataset.go: - Replace the exit-6 stub with push.Stage(...) wired to cluster-discovered ns/PVC/SA + the new SPDYExecutor. - Add v0.1 category gate (Medium-1 from PR-a self-review). Runs BEFORE schema validation so unsupported-but-schema- valid categories like tabular_classification get the actionable "v0.1 supports only image_classification" message instead of the schema's confusing "missing property 'schema'". - Add --stage-pod-image flag for air-gapped customers. - Update exit-code doc: drop exit 6 (PR-b stub), add exit 7 (staging-step failed, distinct from pre-flight 3/4). internal/push/stage_test.go covers Medium-2 from PR-a review: TestStage_IngestorSANameFlowsToPod pins that the --ingestor-sa override actually lands on the stage Pod's ServiceAccountName. ## Tricky bits Pipe deadlock guard: when the executor returns early (ctx cancel, remote tar dies immediately), the tar-write goroutine would block on pipe.Write forever because nothing reads. Fix: close the pipe Reader after exec.Exec returns, which unblocks the writer with io.ErrClosedPipe. Caught by TestStage_CancelledContext_StillCleansUp. SIGINT safety: the deferred DeleteStagePod uses a fresh ctx with a 30s deadline rather than the (possibly cancelled) parent ctx. Without this, Ctrl-C right after pod-create leaks an orphan Pod. activeDeadlineSeconds=600 is the in-cluster backstop for the truly-pathological case (hard-kill, network partition between laptop and cluster). errcheck cleanup: writeLayoutTar uses a named return + deferred close that promotes a tar trailer-write error if and only if the function otherwise succeeded — silent truncation would be much worse than a noisy error. ## Test plan - [x] go vet ./... — green - [x] go test -race -cover ./... — green (push 79.8%, cluster 83.2%, schema 80.7%, cli 52.2%) - [x] gofmt -s -l . — no drift - [x] errcheck ./... — green - [x] go build — binary builds - [x] Local smoke: --category=tabular_classification → exit 2 with "v0.1 supports only image_classification" message (the new gate's actionable diagnostic) - [ ] Real EKS smoke (manual; out of CI scope) ## Open items deferred Per PR-a review's Low/Nit list: - runDatasetPushArgs.Context shadowing context.Context - printPushPreflight rendering AccessModes with %v - testutil package consolidation for imgcDir / imgcLayout - dataset_test kubeconfig path traversal via subtests - Discover hint for nested-image-dirs These are tracked for v0.2 cleanup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

LukasWodka · 2026-05-22T08:18:18Z

👋 Heads-up — Code review queue is at 21 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

backend#698 — ci: add Tests workflow (Django suite + Bandit plugin pytest) · author: @aptracebloc · reviewer: @saadqbal
client-runtime#36 — Bump requests from 2.32.3 to 2.33.0 in /Node-deploy · author: @dependabot · no reviewer assigned
client-runtime#37 — Bump black from 25.1.0 to 26.3.1 in /Node-deploy · author: @dependabot · no reviewer assigned
client-runtime#38 — Bump requests from 2.32.3 to 2.33.0 · author: @dependabot · no reviewer assigned
client-runtime#39 — Bump black from 25.1.0 to 26.3.1 · author: @dependabot · no reviewer assigned
client-runtime#45 — Staging: Enhance jobs-manager with HTTP proxy, ingestion endpoint, and fixes · author: @saadqbal · no reviewer assigned
client-runtime#47 — fix(ci(build): compile-check all 8 release targets in build.yml #46): proxy refresh hotloop, 201-on-dup-insert, ingest_config not in idempotency · author: @saadqbal · no reviewer assigned
data-ingestors#118 — Document data-ingestor release procedure · author: @saadqbal · no reviewer assigned
design-system#19 — fix: un-track coverage/ and node_modules/ from git · author: @LukasWodka · no reviewer assigned
docs#3 — Block bots from crawling Mintlify static assets · author: @LukasWodka · reviewer: @saadqbal

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)

…+Med) Two real bugs caught by Bugbot's re-review on PR-b: ## 1. High — Stage Pod name breaks DNS-1123 BuildStagePodSpec used the raw table name as a Pod name segment. ValidateTableName accepts [A-Za-z0-9_]+ (MySQL identifier rules), but K8s Pod names follow DNS-1123 subdomain rules: lowercase alnum + hyphen, must start/end alnum. The canonical example throughout tracebloc docs (cats_dogs_train, snake_case) would fail Pod create post-pre-flight — worst-of-both-worlds UX (pre-flight says "good!" then create fails). Fix: transform the table name into a DNS-1123-safe segment for the Pod name only — lowercase, _→-, trim leading/trailing hyphens, cap length, fallback to "tbl" for the pathological all-underscore case. The original verbatim name still goes in the tracebloc.io/table label so orphan warnings stay traceable. TestDNS1123SafeTableSegment covers cats_dogs → cats-dogs, MyTable → mytable, _leading_underscore → leading-underscore, _ → tbl, 50-char → 30-char truncation. Each result is cross-validated against an inline DNS-1123 regex check. ## 2. Medium — SIGINT skips Pod cleanup push.Stage documented SIGINT-safe cleanup via context cancellation + fresh-ctx deferred DeleteStagePod, but cmd/tracebloc/main.go called Execute() without signal.NotifyContext. Default Go runtime behaviour on SIGINT is to exit without running defers — so every Ctrl-C during staging leaked an orphan Pod until activeDeadlineSeconds (10 min) fired. The docstring was false advertising. Fix: signal.NotifyContext(ctx, SIGINT, SIGTERM) in main, passed to ExecuteContext. First Ctrl-C cancels the cobra ctx → push.Stage's in-flight HTTP cancels → deferred cleanup runs (with its own fresh ctx, also kept). Second Ctrl-C does normal hard-kill (stop() in defer un-registers the handler) — important if the cleanup itself hangs. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two new findings from Bugbot's re-review on commit 35ae044: ## Medium — WaitForStagePodReady polled through terminal errors The poll callback returned (false, nil) for EVERY Pods.Get error, which meant Pod-deleted-out-of-band (NotFound) or RBAC-revoked-mid- push (Forbidden) waited the full 60s timeout despite the situation being unrecoverable. Customer-facing impact: 60s of "Waiting for stage Pod to be Ready..." for an error that should surface in <1s. Fix: classify the error. NotFound + Forbidden terminate the poll immediately (return (false, err)); everything else stays transient (network blip, brief API unavailability). The apierrors.Is{NotFound,Forbidden} helpers were already imported from the previous fixes, so this is a single-block change. Two new tests pin the contract via t.Now() bounds — if a regression makes either case transient again, the test waits the full 60s and the assertion catches it. ## Low — BuildStagePodSpec swallowed crypto/rand error `suffix, _ := randomSuffix(4)` — if crypto/rand failed (rare but possible on systems with exhausted entropy), the suffix was empty and the Pod name became `tracebloc-stage-cats-dogs-` with a bare trailing hyphen. DNS-1123 rejects that → opaque API error from kube-apiserver instead of a clear local diagnostic. Fix: change BuildStagePodSpec signature to (*Pod, error) and propagate the rand failure. CreateStagePod (the prod caller) already returns an error; tests use a mustBuildStagePod helper that t.Fatal's on the rare path. 8 test call sites updated via find-and-replace, plus the override-image test by hand. The third inline comment ("SIGINT skips pod cleanup", bug_id 24ab9106) is a stale carry-over of the prior-round finding I already fixed in 35ae044 — same bug_id, GitHub auto-shifted the anchor onto the new commit. Replied in-thread pointing at the fix. Bugbot's review-body confirms "2 NEW findings", not 3. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Round 4 of Bugbot fixes on PR-b. Two new findings on commit 7baa321 (plus 3 stale carry-overs from earlier rounds — same bug IDs as already-fixed findings, GitHub keeps re-anchoring them onto each new commit; Bugbot's review-body confirms "2 NEW findings"). ## Medium-Security — symlink images bypassed size caps The vulnerability: Discover sized image entries via DirEntry.Info() (Lstat-equivalent: returns the symlink's own ~100-byte size). writeTarFile read them via os.Stat + os.Open (which follow symlinks). So a customer with `images/evil.jpg` symlinked to `/var/log/system.log` (multi-GB) or `~/.ssh/id_rsa` could: 1. Pass Discover's MaxSingleFileBytes + MaxTotalBytes caps trivially (cap is 1 GiB; the symlink itself is ~100 bytes). 2. Stream the target's FULL contents into the cluster PVC, where the cluster admin can read them via `kubectl exec`. This is both a size-cap bypass AND an arbitrary-local-file disclosure. Worse, it can fire unintentionally — a customer who ran `ln -s ~/datasets/big-images images` to share data across projects would silently stream gigabytes during what they thought was a small test push. Fix in three layers: 1. walk.go: os.Stat → os.Lstat for the labels.csv check (so the mode bits include ModeSymlink) + new rejectSymlink() helper called for both labels.csv and each image entry. v0.1 rejects symlinks outright with a clear "v0.1 doesn't allow symlinks (security: ...)" message pointing at `cp -L` for materializing and the v0.2 cloud-source story for distributed data. 2. stream.go: writeTarFile's os.Stat → os.Lstat + symlink check too. Defense-in-depth — Discover is the primary fix, but a future refactor that calls writeTarFile directly (e.g. a new resume-from-partial-transfer code path in v0.2) would re-introduce the hole without this layer. 3. New tests pin both rejections (labels.csv-as-symlink + image- as-symlink). Skipped on Windows where symlinks require admin. ## Medium — WaitForStagePodReady didn't short-circuit on Phase=Failed Companion to the NotFound/Forbidden short-circuit landed in the previous commit. The poll positively-terminated on Ready=True but never negatively-terminated on Phase=Failed (container crashed at startup: PSA rejection, image crashloop, OOM at container init) or Phase=Succeeded (stage container's sleep exited unexpectedly). Same 60s timeout symptom for an outcome that's actually decided in <5s. Fix: in the poll body, check Phase after the conditions loop; return a structured error including the podReadyTimeoutHint output (container-status reason + message) so the customer sees "terminated in phase Failed (OOMKilled — ...)" instead of a generic timeout. New test TestWaitForStagePodReady_FailedPhaseIsTerminal pins the <3s elapsed contract for an OOMKilled startup. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ress defer Round 5 of Bugbot fixes on PR-b. The "5 potential issues" review-body count corresponds to 5 NEW findings; the other 5 inline comments are stale carry-overs of bug IDs already fixed in earlier rounds (24ab9106, 15e29a3e, 258869d3, 5c577688, 7fcf137f — GitHub auto-anchors them onto each new commit). ## Medium — Symlinked images/ directory bypassed all checks Round 4 fixed symlinked FILES inside images/ (Lstat on each entry). It missed the case where images/ ITSELF is a symlink — `ln -s /etc images` passed os.Stat (which follows symlinks), the IsDir check succeeded, and the loop walked /etc. Same disclosure pattern as the round-4 finding, one layer up. Fix: os.Stat → os.Lstat on imagesDir + rejectSymlink before the IsDir check. Symmetric with the labels.csv treatment from round 4. New test TestDiscover_RejectsSymlinkedImagesDir pins the rejection. ## Medium — Table label could exceed Kubernetes 63-char limit ValidateTableName accepted [A-Za-z0-9_]+ without a length cap, but the stage Pod's tracebloc.io/table label carries the raw name. K8s label values cap at 63 chars (DNS-1123 label rule), so a 100-char name passed pre-flight then failed Pod creation with an opaque label-validation error. Fix: cap at 63 chars (also matches MySQL's 64-char identifier limit with 1 char of headroom for any future ingestion-run-id suffix). New MaxTableNameLength const + boundary test pin the cap. ## Medium — Re-push left stale PVC files The remote command was `mkdir -p <dest> && tar -xf - -C <dest>`. If a previous push had 3 images and the new push has 2, the PVC ends up with the union — and labels.csv now disagrees with the stage directory, silently breaking ingestion. Fix: `rm -rf <dest> && mkdir -p <dest> && tar -xf - -C <dest>`. Safe because dest = StagedPrefix(table) = /data/shared/<table> and ValidateTableName has ensured `table` is a single safe segment, so rm -rf can only nuke that one per-table subdir, never the parent /data/shared or sibling tables. TestStreamLayout_RemoteCommand updated to assert the rm AND its ordering before mkdir. ## Low — Progress bar not Finish'd on early Stage failure Stage's deferred Finish lives inside StreamLayout, so a failure earlier in the lifecycle (CreateStagePod, WaitForStagePodReady) returned without clearing the TTY bar. Visual artifact on the customer's terminal after a Pod-create failure. Fix: defer progress.Finish() at the construction site in runDatasetPush. Schollz Finish is idempotent so double-call on happy path is a no-op. ## Hard-link bypass — documented, not fixed Bugbot also flagged that hard links to outside files aren't rejected. Filesystem mode bits don't distinguish a hard link from a regular file the way ModeSymlink distinguishes symlinks, and a high-Nlink check has too many false positives. The implicit security boundary is the CLI's process-level read permissions: a customer can only hard-link files they already have read access to, so this isn't a privilege escalation. v0.2 may add openat(O_NOFOLLOW) sandboxing if customers need harder isolation. Documented in rejectSymlink's docstring as a known limitation alongside the v0.2 plan. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Round 6. Review-body says "2 NEW + 3 unresolved" — Bugbot itself is flagging the 9 carry-overs at this point. Two new findings, both real: ## High — Pod activeDeadlineSeconds (600s) cut too close The deadline starts at Pod CREATION, but image pull (up to 60s) + WaitForStagePodReady ceiling (60s) + worst-case stream (1 GiB at 2 MB/s ≈ 8.5 min) eat the budget. Near-cap customers on slow uplinks could see the kubelet terminate mid-transfer with no useful diagnostic. Fix: bump StagePodActiveDeadline from 600 → 1800 (30 min). Budget breakdown in the const's docstring; comfortable margin for variance. Cost is "an idle alpine Pod with sleep idles for ~22min after a successful push" — ~5 MiB cluster RAM, zero CPU, deleted seconds later by the defer'd cleanup anyway. Test pin: TestStagePodActiveDeadline_CoversFullLifecycle asserts the floor at 1500s so a future regression to 600 is caught. ## Medium — Windows tar paths used backslashes `filepath.Join("images", filepath.Base(abs))` produces `images\file.jpg` on Windows. USTAR / POSIX tar requires forward slashes; the Linux stage Pod's `tar -xf` either rejects or extracts as a flat-named file (collapsing the images/ subdir the ingest spec expects). Breaks the Windows-built CLI. Fix: switch to `path.Join` for the tar HEADER name. Keep `filepath.Join` everywhere else (where the OS-native separator is the right thing). Test pin TestStreamLayout_TarPathsAreForwardSlash asserts no backslashes in any tar entry name, regardless of host OS. ## Carry-overs Bugbot's review body now explicitly says "There are 3 total unresolved" — it knows the carry-overs aren't on the new commit anymore. Counts I'm tracking as known-fixed across earlier rounds: 24ab9106 (SIGINT, r3), 15e29a3e (Pod-Get spin, r3), 258869d3 (rand error, r3), 5c577688 (Failed Pod, r4), 7fcf137f (symlink files, r4), de426248 (hard links, r5 — documented limitation), cd462de9 (symlink dir, r5), a8e9e5c7 (label len, r5), 0b296807 (progress finish, r5). Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…re-push Round 7 of Bugbot fixes on PR-b. Review-body confirmed "4 NEW + ...". All four real, all four fixed: ## Medium — Wait errors mislabeled as timeout WaitForStagePodReady wrapped EVERY poll exit (NotFound, Forbidden, Phase=Failed, ctx.Canceled from SIGINT) in "did not become Ready within 60s." Misleading: the customer who hit Ctrl-C 2 seconds into the wait would see a "timed out" diagnostic. Fix: branch on errors.Is(err, context.DeadlineExceeded). True timeout keeps the "did not become Ready within %s" wording; early-exit cases surface as "did not reach Ready state: %w" with the actual cause front and center. ## Medium — Orphan-delete hint would nuke parallel pushes FormatOrphansWarning's kubectl-delete command used the `managed-by=tracebloc-cli,component=stage-pod` label selector, which matches every stage Pod in the namespace including LEGITIMATE RUNNING ONES from parallel pushes (other workstations, or even this one's just-started Pod). A customer copy-pasting the suggested cleanup could silently kill someone else's in-progress push. Fix: list specific orphan Pod names in the delete command — `kubectl delete pod -n <ns> <name1> <name2> ...`. Test regression- pins the absence of the label-selector form. ## Medium — Re-push deleted before transfer succeeded Round 5's hermetic-re-push fix (rm -rf $DEST && mkdir + tar) satisfied "no stale files" but not "preserve on failure." A tar mid-stream failure (Ctrl-C, network drop, container OOM) left the customer with NOTHING on the PVC — the previous push's data already nuked, the new push aborted before completing. Fix: extract to <dest>.staging, swap on success: rm -rf <dest>.staging # cleanup any prior failure mkdir -p <dest>.staging && tar -xf - -C <dest>.staging rm -rf <dest> && mv <dest>.staging <dest> # swap on success rm -rf <dest>.staging # defensive cleanup The shell's && sequencing means swap only fires if tar succeeded. Lost the prior `exec /bin/tar` micro-optimization (can't exec mid-chain) — fine, the shell process is tiny. The window between the destination rm and the mv is sub-ms; closing it fully would need a double-mv (old/new/cleanup) which is v0.2 territory. stream_test.go pins: - mkdir-p of .staging happens - tar extracts to .staging (not directly to dest) - mv from .staging to dest exists - tar happens BEFORE any rm of $DEST (transactional property) - no rm of the parent /data/shared (sibling-table safety) ## Medium — Orphan scan flagged active pushes FindOrphanStagePods's only filter was age > 5min. But pod.go itself budgets ~8.5 min for a 1 GiB transfer — a legitimately-running near-cap push would be mislabeled as orphan by the same customer's next concurrent push, and the (newly-fixed) delete hint would now correctly target that specific running Pod by name, killing it. Fix: skip Phase=Running pods entirely. A Running stage Pod is presumed to be doing real work; activeDeadlineSeconds is its cluster-side safety net. Pods in non-Running phases past grace (Pending stuck on image pull, Failed from crash) still flag — those are the genuine orphan shapes. Two new tests: - TestFindOrphanStagePods_SkipsRunningPods: 30-min-old Running Pod doesn't surface as orphan - TestFindOrphanStagePods_FlagsNonRunningPastGrace: 30-min-old Failed Pod DOES surface (regression guard — narrowing the filter should reduce false positives, not eliminate the warning entirely) Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Round 8 of Bugbot fixes. Two new findings on commit 42c5e93, both real: ## Medium — Parallel pushes corrupted staging Two concurrent `dataset push` runs for the same `--table` shared one `$DEST.staging` path on the PVC. Each Pod's `rm -rf $DEST.staging`, `tar -xf -`, `mv` sequence raced against the other, with no coordination. Worst case: push A's `rm -rf` wipes push B's mid-extraction state, B's tar then writes into a partially-removed dir, customer ends up with an interleaved- corrupted-and-half-committed dataset. Fix: each invocation generates a fresh 8-hex-char suffix for the staging dir name (`$DEST.staging-<8hex>`), reusing pod.go's randomSuffix. Two parallel pushes now have distinct staging dirs — no interleaved-write hole. The FINAL `rm $DEST && mv $STAGING $DEST` step is still last-write-wins under concurrent commits, but that's an acceptable v0.1 semantic (concurrent pushes for the same table are inherently undefined; whichever commits last "wins" with a COHERENT dataset, not an interleaved one). New test TestStreamLayout_StagingSuffixIsUniquePerInvocation pins the contract by calling StreamLayout twice and asserting distinct staging suffixes. ## Medium — Stream skipped size-cap enforcement (TOCTOU) Pre-flight Discover checked MaxSingleFileBytes + MaxTotalBytes, but writeTarFile / writeLayoutTar streamed files later via os.Lstat → io.Copy with NO re-check. A file that grew between pre-flight and stream (legitimate dataset prep racing the push, or adversarial overwrites) would silently upload past the 500 MiB / 1 GiB advertised caps. Fix in three layers: 1. writeTarFile re-checks the single-file cap (os.Lstat size vs MaxSingleFileBytes) right before WriteHeader. A file that grew gets a clear "exceeds v0.1 single-file cap" error using the same sizeError formatter Discover uses. 2. writeTarFile now returns (int64, error) — the declared header size. writeLayoutTar accumulates this into a running total and aborts mid-stream if it exceeds MaxTotalBytes. Tests added below to pin the running-total contract. 3. io.Copy → io.CopyN(tw, f, st.Size()) caps the actual byte transfer at the declared header size. Without the cap, a file that grew between Lstat and io.Copy would overflow the tar header — header says N bytes, body has > N → tar archive corruption. CopyN-and-trust truncates instead. Closes both the metadata-side (header size) and body-side (byte stream) halves of the stream-time TOCTOU window. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…dirs Round 9 of Bugbot fixes. Both findings real: ## Medium — Tar goroutine could deadlock on suffix error The randomSuffix call landed AFTER the tar-write goroutine spawned. If crypto/rand failed (rare but possible on entropy-starved systems — same scenario the round-3 BuildStagePodSpec fix handles), the function returned without closing the pipe or waiting on the goroutine. The goroutine would then block once the ~64 KB pipe buffer filled, leaking forever. Fix: hoist randomSuffix (and the entire remoteCmd composition) ABOVE the io.Pipe + goroutine setup. The function now bails on suffix failure before touching the pipe at all, so there's nothing to deadlock on. ## Medium — Failed pushes leaked .staging-<hex> dirs Round 8's unique-per-invocation suffix fixed parallel-push interleaving but created a new leak: if THIS push fails before the final `mv` step, the .staging-<hex> dir lingers on the PVC. Round 8's `rm -rf $STAGING` at the start only cleans up THIS invocation's path, not the previous-failed one. Repeated failed pushes accumulate unbounded storage on the shared PVC. Fix: append a defensive cleanup pass to the remote script: find $(dirname $DEST) -maxdepth 1 -name "$(basename $DEST).staging-*" \\ -mmin +60 -exec rm -rf {} + 2>/dev/null || true Constraints baked into the pattern: - -mmin +60 (1 hour) is 2x activeDeadlineSeconds — anything older HAS to be from a dead stage Pod, so we can't race with a parallel push's in-progress staging - -name pattern scopes to THIS table's staging siblings only; other tables' .staging-* dirs are none of our business - 2>/dev/null || true — find failures don't fail the whole stream (defensive cleanup is best-effort, the customer's actual push already committed before this runs) - Uses ';' (semicolon) instead of '&&' so it runs even if the main push sequence failed somehow — orphan cleanup should happen regardless Updated TestStreamLayout_RemoteCommand asserts the find pattern is present with the exact -mmin window and table-scoped -name. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Bugbot's tenth-round single finding: the previous r7 pattern of `rm $DEST && mv $STAGING $DEST` had a small but real failure window — if the rm succeeded and the mv failed (ENOSPC, fs-level rename error, transient kernel weirdness), the customer's previous dataset was already gone and the new data was stranded under .staging-<hex> where the ingestor wouldn't see it. Fix: backup-and-rollback. Rename the existing $DEST to a unique $DEST.old-<hex> sibling FIRST; only then mv $STAGING into $DEST; on the latter's failure, restore the backup. On success, drop the backup. { [ -e $DEST ] && mv $DEST $DEST.old-<hex> || true; } && { mv $STAGING $DEST || { [ -e $DEST.old-<hex> ] && mv $DEST.old-<hex> $DEST; exit 1; }; } && rm -rf $DEST.old-<hex> Properties this gives us: - tar fails: $DEST untouched (transactional from r7) - backup mv fails: $DEST untouched - main mv fails: backup at .old-<hex> survives; customer recovers via `kubectl exec -- mv .old-<hex> $DEST` - final rm fails: just leaves an .old-<hex> cruft, picked up by the r9 find -mmin +60 sweep (extended to also catch .old-* siblings) - first-ever push (no pre-existing $DEST): `[ -e $DEST ] && ...` short-circuits, backup mv silently skipped via `|| true` The .old-<hex> suffix reuses the same random as .staging-<hex> so two parallel pushes can't collide on the backup name. Both .staging-* AND .old-* now flow through the orphan-cleanup find pattern (with the `-o` alternation), so r9's leak-prevention covers both halves of the swap. Test updates pin: - Backup mv (DEST → .old) appears BEFORE primary mv (.staging → DEST) — rollback contract - Backup and staging suffixes MUST agree (rollback-target correctness) - find pattern includes both -name "...staging-*" and -name "...old-*" Local: vet, test -race -cover, gofmt -s, errcheck — all green. This is round 10. The new-finding rate has been decaying: 5→2→2→2→1. We're approaching the asymptote. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t r11) Two new findings on r10's commit, both real: ## Medium — Remote script masked failures via POSIX `;` + `|| true` The previous shell used `&&-chain ; find ... || true`. POSIX `;` runs the next command regardless of prior exit status, and the trailing `|| true` then forces exit 0. A failed tar or mv earlier in the && chain would set $? to non-zero, then `;` would clobber it back to 0 via the find's `|| true`. Result: a remote push that actually failed would return exit 0 to the exec subprocess, and StreamLayout would happily report "Staged N files" on a catastrophic failure. Fix: rewrite the script using `set -e` + explicit newline- separated statements + an explicit `if ! mv; then rollback; exit 1; fi` for the swap. With set -e: - Any non-guarded non-zero exits the script with that status. - `cmd || true` continues to suppress (find cleanup stays best-effort). - `if ! cmd; then ...; fi` is treated as guarded, so set -e doesn't preempt the explicit rollback handler. Multi-line shell-c args work across busybox sh (alpine 3.20), dash, and bash equally. No portability regression. ## Medium — Stream error masked the upstream tar error After r8 added stream-time MaxSingleFileBytes / MaxTotalBytes rechecks in writeLayoutTar, the LOCAL tar build can legitimately fail mid-stream. When it does, the goroutine's CloseWithError propagates to the pipe reader; exec sees broken-pipe and returns a generic "exec stream against ns/pod failed" error. The previous code checked streamErr FIRST and returned the exec-flavored framing. Customer saw "streaming files failed" instead of the actually-actionable "dataset exceeded v0.1 total cap of 1.00 GiB after streaming ..." diagnostic from the tar side. Fix: swap the order. Check tarErr first — when both are non-nil, the tar side is the upstream cause; the broken-pipe streamErr is downstream noise. streamErr-only (no tarErr) is still the real network/RBAC/remote-tar case and gets surfaced with the exec wording. Local: vet, test -race -cover, gofmt -s, errcheck — all green. Round 11 finding rate: 2 new (1+2+5+2+4+2+2+1+2 → still trending roughly downward, though not monotone). One more clean round and we're at the asymptote. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 8bfada3. Configure here.}

Independent review pass over PR #9 (post-bugbot-r11). Five findings, none caught by bugbot — all cross-cutting doc/style: 1. stream.go: dead comment block above remoteCmd composition. The block was a holdover from the && era; it claimed "&& sequencing means the swap only fires if tar succeeded" but we switched to `set -e` in r11. The canonical pipeline doc lives below the composition; collapsed the upstairs block into a single semantic- guarantees summary (HERMETIC/TRANSACTIONAL/PARALLEL-SAFE/EXIT- FAITHFUL) and removed the contradictory && reference. 2. stage.go: StageOptions docstring referenced a non-existent `OrphanLogger` field. Stale from an early design where orphan warnings were emitted via a logger callback rather than the integrated FormatOrphansWarning we ended up with. Corrected to the actual nil-able fields (Progress, Out). 3. orphan.go: OrphanGracePeriod comment claimed "5 minutes is generous: a healthy stage Pod is fully done in ~30 seconds... under a couple minutes at the 1 GiB cap." That directly contradicts pod.go's StagePodActiveDeadline budget (~8.5 min for a 1 GiB transfer at 2 MB/s, plus image pull, plus ready wait). Rewrote to reflect the post-r7 semantic: Running Pods are never flagged regardless of age; the 5-min grace targets the Pending/Failed/Unknown shapes that are genuinely stuck. 4. orphan.go: joinNames was a 9-line wrapper around `strings.Join( names, " ")`. The comment justifying it ("single point of change for tests") doesn't hold — the tests assert on the final output string, not on this helper's surface. Inlined. 5. cli/dataset.go: runDatasetPush docstring still said "the PR-a slim implementation. It performs every pre-flight check... the actual file staging is gated behind a clear 'not yet implemented' error." PR-b (this PR) actually implements the staging — the doc lagged the code. Updated to reflect the complete Phase 3 flow. No behavioral change. Tests + lint green. The doc fixes matter because future readers (humans and the next bugbot pass) get contradictory signals when the comments disagree with the code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

saadqbal self-assigned this May 22, 2026