feat(#30): dataset rm — in-cluster teardown (table + PVC) [DRAFT] #40

Merged: aptracebloc merged 1 commit into develop from feat/30-dataset-rm on Jun 3, 2026

@aptracebloc
Contributor

@aptracebloc aptracebloc commented Jun 3, 2026

Draft — pending team-lead review of the teardown architecture. Implements #30 (the CLI-direct approach); the central-catalog piece is split out as #39.

Adds tracebloc dataset rm <table> — the in-cluster teardown of a previously-pushed dataset, packaging the manual kubectl exec cleanup we used during v0.1 testing.

What it does

  1. Discovers the cluster (parent release + shared PVC) — same as push.
  2. Shows the teardown plan (the MySQL table + both PVC dirs) + a destructive-action warning.
  3. Confirms: reuses the prompter from #28 (feat(dataset push): interactive guided mode) on a TTY; --yes skips the prompt; --dry-run previews and stops; without a TTY and without --yes, the command refuses (exit 3).
  4. DROPs the table (exec into the mysql pod, using its own $MYSQL_ROOT_PASSWORD so no DB credential transits the CLI) and rm -rf's the PVC dirs (exec into the jobs-manager pod, which mounts the shared PVC).

Exit codes mirror push: 2 bad name, 3 kubeconfig/refused, 4 no release/PVC, 7 teardown failed mid-flight.
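
The confirmation rules in step 3 and the exit codes above can be sketched as a small decision function. This is a minimal sketch, not the PR's code: the names (decideGate, gateAction, the exit* constants) are hypothetical, and the precedence of --dry-run over --yes is an assumption. Only the numeric exit codes and the four behaviours come from the description.

```go
package main

import "fmt"

// Exit codes as listed in the PR description. The constant names are
// hypothetical; only the numeric values come from the PR.
const (
	exitBadName        = 2
	exitKubeOrRefused  = 3
	exitNoReleaseOrPVC = 4
	exitTeardownFailed = 7
)

// gateAction is what the command does after printing the teardown plan.
type gateAction int

const (
	actStop    gateAction = iota // --dry-run: preview only, delete nothing
	actProceed                   // --yes: skip the prompt and delete
	actPrompt                    // TTY without --yes: ask the user
	actRefuse                    // no TTY and no --yes: refuse (exit 3)
)

// decideGate maps the flags and terminal state to an action.
// Assumption: --dry-run wins over --yes when both are passed.
func decideGate(dryRun, yes, isTTY bool) gateAction {
	switch {
	case dryRun:
		return actStop
	case yes:
		return actProceed
	case isTTY:
		return actPrompt
	default:
		return actRefuse
	}
}

func main() {
	// A piped, non-interactive invocation without --yes must not delete.
	if decideGate(false, false, false) == actRefuse {
		fmt.Println("refused, exit", exitKubeOrRefused)
	}
}
```

Keeping the decision pure (no I/O) means it can be unit-tested without a pseudo-terminal, in the same spirit as the fake prompter from #28.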

⚠️ The open question (why this is a draft)

The teardown mechanism is exec-into-existing-pods (the "CLI-direct" path). The alternative is a server-side jobs-manager delete-ingestion endpoint that could also remove the backend catalog entry in one place. See the DESIGN NOTE on push.Teardown. This needs your call before it ships.

It also does not remove the central backend catalog entry — the CLI has no direct line to that backend, so a successfully-ingested dataset torn down this way leaves a stale catalog entry. Tracked as #39.

Assumptions worth confirming: a pod whose name contains mysql exposes $MYSQL_ROOT_PASSWORD, and the jobs-manager pod mounts the shared PVC at /data/shared (push.SharedRoot). Both hold for the current parent chart.
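
To make these assumptions concrete, here is a hedged sketch of the two command lines that would be exec'd into the pods. The helper names (dropTableArgv, rmArgv, safeIdent) are hypothetical, the identifier check is a stand-in for push.ValidateTableName (whose exact rules are not shown in this PR), and connecting as the MySQL root user is an assumption inferred from $MYSQL_ROOT_PASSWORD. The point it illustrates is that the password variable is expanded by the shell inside the pod, so the CLI never sees its value.

```go
package main

import (
	"fmt"
	"regexp"
)

// safeIdent is a deliberately strict, hypothetical identifier check.
// The real guard is push.ValidateTableName; its rules may differ.
var safeIdent = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]{0,63}$`)

// dropTableArgv builds the argv to exec inside the mysql pod.
// $MYSQL_ROOT_PASSWORD is left unexpanded here: the in-pod shell
// resolves it, so no credential passes through the CLI.
func dropTableArgv(db, table string) ([]string, error) {
	if !safeIdent.MatchString(db) || !safeIdent.MatchString(table) {
		return nil, fmt.Errorf("unsafe identifier %q.%q", db, table)
	}
	// The SQL is single-quoted so the shell does not treat the
	// backticks as command substitution.
	script := fmt.Sprintf(
		"mysql -uroot -p\"$MYSQL_ROOT_PASSWORD\" -e 'DROP TABLE IF EXISTS `%s`.`%s`'",
		db, table,
	)
	return []string{"sh", "-c", script}, nil
}

// rmArgv builds the argv to exec inside the jobs-manager pod.
// "--" stops option parsing in case a path starts with "-".
func rmArgv(paths ...string) []string {
	return append([]string{"rm", "-rf", "--"}, paths...)
}

func main() {
	argv, err := dropTableArgv("training_test_datasets", "demo_rm_preview")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", argv)
	fmt.Printf("%q\n", rmArgv("/data/shared/demo_rm_preview"))
}
```

Because the table name is interpolated into both SQL and a shell script, validating it before building either command is what keeps the exec path safe.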

Verification

  • make ci green.
  • Unit tests: TestPlanTeardown (the artifact set) + TestRunDatasetRm_InvalidTableExitsTwo (the name guard, no cluster needed).
  • Live --dry-run on EKS dev — discovers release/PVC, prints the plan, deletes nothing:
  Will delete
    mysql table:   training_test_datasets.demo_rm_preview
    pvc path:      /data/shared/demo_rm_preview
    pvc path:      /data/shared/.tracebloc-staging/demo_rm_preview
  • The destructive path was NOT run — that needs your nod + a throwaway table first.
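
The artifact set shown in the dry-run output can be derived from the table name alone. The sketch below is an illustration of that mapping, not the actual push.PlanTeardown: the type and function names are hypothetical, and the constants are taken from the output above (/data/shared, .tracebloc-staging, training_test_datasets). It assumes the table name has already passed validation, since an unvalidated name such as ".." would escape the shared root.

```go
package main

import (
	"fmt"
	"path"
)

// Values taken from the dry-run output in this PR.
const (
	sharedRoot = "/data/shared"           // push.SharedRoot per the description
	stagingDir = ".tracebloc-staging"     // staging directory on the shared PVC
	database   = "training_test_datasets" // MySQL database holding pushed tables
)

// teardownPlan is a hypothetical shape for the set of artifacts to delete.
type teardownPlan struct {
	Table    string   // fully qualified MySQL table
	PVCPaths []string // final and staging directories on the shared PVC
}

// planTeardown derives the plan from an already-validated table name.
func planTeardown(table string) teardownPlan {
	return teardownPlan{
		Table: database + "." + table,
		PVCPaths: []string{
			path.Join(sharedRoot, table),
			path.Join(sharedRoot, stagingDir, table),
		},
	}
}

func main() {
	p := planTeardown("demo_rm_preview")
	fmt.Println("Will delete")
	fmt.Printf("  mysql table:   %s\n", p.Table)
	for _, dir := range p.PVCPaths {
		fmt.Printf("  pvc path:      %s\n", dir)
	}
}
```

A pure planning step like this needs no cluster, which is what lets TestPlanTeardown pin the artifact set in a unit test.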

Refs #30, #31, #39.

🤖 Generated with Claude Code


Note

High Risk
Destructive, irreversible deletes against live cluster MySQL and shared PVC; relies on chart-specific pod naming and exec assumptions, and leaves backend catalog entries stale until #39.

Overview
Adds tracebloc dataset rm <table> to tear down artifacts from a prior dataset push: the MySQL table in training_test_datasets and the final + staging dirs on the shared PVC.

The command follows the same flow as push (validate table name → kubeconfig/cluster discovery → pre-flight plan → act), with --dry-run, TTY confirmation (or --yes / refuse off-terminal), and exit codes aligned with push (2 bad name, 3 kube/refused, 4 missing release/PVC, 7 teardown failure).

Teardown is implemented in internal/push via PlanTeardown and Teardown: DROP TABLE by exec into a running mysql pod (using in-pod $MYSQL_ROOT_PASSWORD) and rm -rf on PVC paths via exec into jobs-manager. The central backend catalog entry is not removed (noted as #39 follow-up).

Unit tests cover PlanTeardown artifact paths and early exit 2 for invalid table names without touching the cluster.

Reviewed by Cursor Bugbot for commit 7809580.

Adds 'tracebloc dataset rm <table>': the in-cluster teardown of a pushed dataset, packaging the manual kubectl-exec cleanup. Discovers the cluster (parent release + shared PVC), shows the teardown plan, confirms (reuses the #28 prompter; --yes to skip, --dry-run to preview), then DROPs the MySQL table (exec into the mysql pod, using its own $MYSQL_ROOT_PASSWORD so no credential transits the CLI) and rm -rf's the PVC dirs (exec into the jobs-manager pod). Exit codes mirror push (2 bad name, 3 kubeconfig/refused, 4 no release/PVC, 7 teardown failed).

Does NOT remove the central backend catalog entry — the CLI has no direct line to that backend; tracked as the cross-repo follow-up #39. The exec-into-pods mechanism (CLI-direct) vs a server-side jobs-manager delete endpoint is the open architecture question — see the DESIGN NOTE on push.Teardown. Hence: draft, pending team-lead review.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@LukasWodka
Contributor

👋 Heads-up — Code review queue is at 14 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)

@aptracebloc aptracebloc marked this pull request as ready for review June 3, 2026 20:14
@aptracebloc aptracebloc self-assigned this Jun 3, 2026
@LukasWodka
Contributor

👋 Heads-up — Code review queue is at 15 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)

@aptracebloc aptracebloc merged commit c1a0ae0 into develop Jun 3, 2026
16 checks passed
aptracebloc added a commit that referenced this pull request Jun 4, 2026
* feat(#26): installer-style terminal UX package (internal/ui) (#32)

* feat(#26): add internal/ui installer-style terminal UX package

Printer renders colored step/✔/⚠/· output matching the tracebloc/client installer (scripts/lib/common.sh). TTY+NO_COLOR auto-detect with a WithColor override (functional-options); fatih/color forced per-instance so it's testable on a buffer. Helpers: Banner, Step, Successf, Warnf, Infof, Errorf, Hintf, PromptHeader.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(#26): render dataset push pre-flight via internal/ui + add --plain

Adds a persistent --plain root flag and a printerFor(cmd) helper, plus ui.Section/ui.Field. dataset push pre-flight now renders through the ui.Printer (colored Section/Field rows, ⚠ for the RWO warning) instead of ad-hoc fmt.Fprintf; behavior unchanged (the pinned RendersKeyFacts test still passes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(#26): render cluster info via internal/ui

cluster info now prints through the ui.Printer (Section/Field + a green ✔ ready line) instead of raw fmt.Fprintf, matching the dataset push pre-flight. Test updated for the *ui.Printer signature (still asserts exit-3 on a bad kubeconfig).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(#27): installer-style ingestion summary via internal/ui (#33)

Replaces the box-drawing RenderPanel with RenderSummary(p *ui.Printer, s): an outcome-colored headline (green clean / yellow skips / red failures) + Section/Field counts + a 'what's next' block. Dropping the Unicode box resolves the v0.2 plain-ASCII TODO. submit.Options gains Printer (nil → ui.New(Out)); dataset push threads its --plain-aware printer through. Parser untouched. RenderPanel tests → RenderSummary (basic shape, nil, table-driven outcome).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(#28): interactive guided dataset push — engine + core prompts (PR-a) (#34)

On a TTY (unless --no-input), dataset push now prompts for any missing core inputs (path, category, table, intent, label) before validation; flags already passed win. Introduces a prompter interface (survey-backed in prod, fake in tests) so the prompt-mapping is unit-testable without a pseudo-terminal. Args ExactArgs(1)->MaximumNArgs(1) so a bare command can prompt for the path; explicit path-required error otherwise. MLM skips the label prompt; the table prompt reuses push.ValidateTableName. PR-b adds category-specific prompts + a confirm screen.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(#28): category-specific prompts + confirm screen (PR-b) (#35)

Completes the guided dataset push. After the core prompts, asks only what each category needs: image resolution (blank=auto-detect), required keypoint count, tabular schema (blank=infer) + regression label-policy + time-column. Then a Review screen + 'Proceed?' confirm — shown only when something was actually prompted, so full-flag pushes aren't nagged. Declining or Ctrl-C (survey terminal.InterruptErr, translated to errInteractiveCancelled at the prompter boundary) exits cleanly (0). Prompter interface gains Confirm; fake-prompter tests cover keypoint, tabular-regression, and cancel.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(#29): branded home screen + cluster info banner (PR-a) (#36)

Bare 'tracebloc' (no subcommand) now renders a branded home screen — ui.Banner + a 'get started' command list — via root.RunE, instead of cobra's raw usage dump; subcommands and --help are unaffected. cluster info gains a Banner header. PR-a of #29; --output-json follows in PR-b (it needs runDatasetPush to return a result struct + JSON tags on Summary, so it's its own change).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(#29): --output-json for dataset push (PR-b) (#38)

* feat(#29): --output-json for dataset push (PR-b)

--output-json emits a machine-readable JSON result on stdout and routes all human output (preflight, logs, summary) to stderr, so 'push ... --output-json | jq' is clean. Implies non-interactive. JSON is emitted at the dry-run stop (status=dry-run) and after submit (status=succeeded/failed/detached/unknown + summary counts). The wire shape (pushJSONResult/pushJSONSummary) lives in the CLI layer so submit.Summary stays json-tag-free. Adds printerForWriter for the stderr routing; unit-tested via TestWritePushJSON.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(#29): align --output-json status with exit code + emit JSON on errors

classifyPushOutcome maps submit.Run's result to a (status, exitError) pair in lockstep: JobOutcomeSucceeded + summary.HasFailures() now yields status "completed_with_failures" + exit 9 (was "succeeded" while exiting 9). --output-json now emits exactly one result object on EVERY path — including auth/submit/watch errors (with job namespace+name when available) — instead of nothing on error. Adds TestClassifyPushOutcome. Addresses Bugbot on #38.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(#37): expand ~ in dataset path (interactive prompt + literal arg) (#41)

A leading ~ / ~/… in the dataset path now resolves under $HOME. The shell only expands an *unquoted* ~ on the command line; a path typed at the interactive prompt (#28) or a quoted/literal ~ positional arg reaches the CLI literally, and filepath.Abs would just prepend the CWD (.../cwd/~/...). Adds expandHome (mirrors cluster.expandPath) and applies it in runDatasetPush right after a.LocalPath is resolved — before any push.Discover* call — so both entry points are covered in one place. Relative/absolute/empty paths pass through untouched.
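
A minimal sketch of the expansion described above, assuming it behaves as the commit message says. This is not the repository's expandHome; the signature is a guess, and it uses os.UserHomeDir, which reads $HOME on Unix-like systems.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// expandHome resolves a leading "~" or "~/..." under the user's home
// directory. Everything else (relative, absolute, empty, "~user/...")
// passes through untouched.
func expandHome(p string) (string, error) {
	if p != "~" && !strings.HasPrefix(p, "~/") {
		return p, nil
	}
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	// TrimPrefix leaves "" or "/rest"; Join cleans the result.
	return filepath.Join(home, strings.TrimPrefix(p, "~")), nil
}

func main() {
	for _, p := range []string{"~/datasets/cats", "~", "rel/path", "/abs/path"} {
		got, err := expandHome(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %q\n", p, got)
	}
}
```

Running the expansion before filepath.Abs matters: Abs treats "~" as an ordinary directory name and would prepend the current working directory.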

Deferred the optional in-prompt existence validator: it would couple the prompt-mapping unit tests to real directories on disk. Filed nothing — noted in the PR.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(#30): dataset rm — in-cluster teardown (table + PVC) (#40)

Adds 'tracebloc dataset rm <table>': the in-cluster teardown of a pushed dataset, packaging the manual kubectl-exec cleanup. Discovers the cluster (parent release + shared PVC), shows the teardown plan, confirms (reuses the #28 prompter; --yes to skip, --dry-run to preview), then DROPs the MySQL table (exec into the mysql pod, using its own $MYSQL_ROOT_PASSWORD so no credential transits the CLI) and rm -rf's the PVC dirs (exec into the jobs-manager pod). Exit codes mirror push (2 bad name, 3 kubeconfig/refused, 4 no release/PVC, 7 teardown failed).

Does NOT remove the central backend catalog entry — the CLI has no direct line to that backend; tracked as the cross-repo follow-up #39. The exec-into-pods mechanism (CLI-direct) vs a server-side jobs-manager delete endpoint is the open architecture question — see the DESIGN NOTE on push.Teardown. Hence: draft, pending team-lead review.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci(release): multi-arch v0.1.0 — add linux/386, linux/arm, windows/arm64 (#43)

Widen the release build matrix from 5 to 8 targets for the first stable
v0.1.0: add linux/386, linux/arm (GOARM=6, covers armv6 + armv7), and
windows/arm64. macOS amd64/arm64 and linux/windows amd64/arm64 unchanged.

- install.sh detect_arch: learn i386|i686 -> 386 and
  armv6l|armv7l|armv8l|armhf -> arm (else auto-install breaks on them).
- install.ps1 Get-Arch: return arm64 (was a hard fail) and prefer
  PROCESSOR_ARCHITEW6432 so x64-emulated PowerShell on a Windows-ARM host
  still picks the native arm64 build.
- README + RELEASE_CHECKLIST: reflect the 8-platform stable release; the
  releases/latest/download/install.sh one-liner now resolves (a plain
  v0.1.0 tag is non-prerelease per release.yml's gate, so it is `latest`).

Verified: all 8 targets cross-compile (CGO_ENABLED=0) to the right arch;
release.yml YAML + install.ps1 parse clean. No Go source changed.

Refs #42

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* feat(#31): friendlier dataset push — prose intro, guided steps, per-prompt examples (#44)

* feat(#31): friendlier dataset push — prose intro + guided 4-step flow

Replaces the terse bullet header with a plain-English explainer (grounded in docs.tracebloc.io: 'your workspace', 'the cluster your workspace was installed on', 'your data stays on that cluster'), via a new ui.Printer.Para for prose. Restructures the push into a clearly-numbered, narrated 4-step flow — Check your dataset / Connect to your cluster / Stage your files / Run the ingestion — each with a one-line how/why for medium-technical users (steps 1-2 visible in --dry-run, 3-4 live). Splits printPushPreflight into printLocalSummary + printClusterSummary so each sits under its step; the dry-run stop now names the live-only steps it skipped.

dataset rm: drop the backend-catalog clause from the runtime warning (now just 'Destructive and cannot be undone.'); --help NOTE + #39 tracking unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(#31): per-prompt guidance + examples in interactive push

Each interactive input prompt now shows a visible hint with an example above it — previously this guidance was hidden behind survey's '?' key. Covers the core prompts (path, table, intent, label column) and the category-specific ones (keypoint count, image resolution, tabular schema, regression label-policy, time column). Rendered via a new ui.PromptHint: a leading blank line for separation + cyan text, so per-field guidance stands out and reads distinctly from the dim generic hints (e.g. the 'Press Enter' meta-line). Verified by TestRunInteractive_ShowsExampleHints, which drives runInteractive with a buffer-backed Printer — no TTY needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci(build): compile-check all 8 release targets in build.yml (#46)

release.yml builds 8 targets but build.yml (PR CI) only compiled 5, so a
break in linux/386, linux/arm, or windows/arm64 wouldn't surface until a
release tag. Bring build.yml's matrix in lock-step: add linux/386,
linux/arm (GOARM=6), windows/arm64. Compile-only (the native linux/amd64
smoke step is unchanged).

Closes #45

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* feat(#31): add dataset rm to home screen + in-run guidance (#47)

The bare-'tracebloc' home screen now lists 'dataset rm' alongside push/cluster-info/ingest-validate. The rm command itself gains the same guidance treatment as push: a plain-English Para intro after the banner (what it removes — table + files — and that it can't be undone), and a cyan PromptHint above the confirmation. Home-screen test asserts 'dataset rm' appears.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: prep README for v0.2.0 (#50)

- Status → v0.2.0 (latest stable; builds on v0.1.0 with the friendlier
  guided dataset push + dataset rm on the home screen).
- Reconcile the Roadmap: "Next (v0.2)" framed v0.2 as the cloud-sources
  milestone, but v0.2.0 ships UX polish — so cloud sources / segmentation
  / `list` move to a label-free "Next", and `rm` drops from the future
  verbs (it shipped in v0.1.0).
- Phase-5 table link pinned to the v0.1.0 tag (text said v0.1.0 but
  linked to /latest, which now resolves to v0.2.0).

Refs #48

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* fix: address Bugbot review on #49 (#51)

- dataset push --output-json: emit a JSON error object on early-failure paths (validation, discovery, staging, token, port-forward), not just on dry-run/post-submit. runDatasetPush now uses a named return + a jsonEmitted flag + a defer, so '… --output-json | jq' always gets JSON instead of empty stdout on a failure. Adds Error/ExitCode to the result shape + writePushErrorJSON; covered by TestRunDatasetPush_OutputJSONEarlyFailureEmitsJSON.

- dataset rm: on a partial teardown (DROP TABLE succeeded but the PVC file removal failed), the error now says so and points the user to re-run — both teardown ops are idempotent (DROP TABLE IF EXISTS / rm -rf), so a re-run completes cleanup rather than reporting a flat 'teardown failed'.
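
The partial-teardown reporting in the second bullet can be sketched as follows. This is an illustration under stated assumptions, not the code from the fix: teardownOps, errPartialTeardown, errExec, and isPartial are hypothetical names, and the two operations are injected as functions so the sketch runs without a cluster. The order (table first, then files) follows the failure case the bullet describes.

```go
package main

import (
	"errors"
	"fmt"
)

// teardownOps holds the two idempotent operations. In the CLI these
// would exec into the mysql and jobs-manager pods.
type teardownOps struct {
	dropTable  func() error // DROP TABLE IF EXISTS ...
	removeDirs func() error // rm -rf on the PVC paths
}

var (
	errPartialTeardown = errors.New("partial teardown")
	errExec            = errors.New("exec failed") // stand-in failure for the demo
)

// teardown drops the table, then removes the files. If only the second
// step fails, the error says so and tells the user a re-run is safe.
func teardown(ops teardownOps) error {
	if err := ops.dropTable(); err != nil {
		return fmt.Errorf("drop table: %w", err)
	}
	if err := ops.removeDirs(); err != nil {
		return fmt.Errorf("%w: table dropped but PVC files remain (%v); re-run to finish cleanup",
			errPartialTeardown, err)
	}
	return nil
}

// isPartial reports whether err marks a half-finished teardown.
func isPartial(err error) bool {
	return errors.Is(err, errPartialTeardown)
}

func main() {
	err := teardown(teardownOps{
		dropTable:  func() error { return nil },
		removeDirs: func() error { return errExec },
	})
	fmt.Println(err)
	fmt.Println("partial:", isPartial(err))
}
```

Since both operations are idempotent, a re-run after a partial failure repeats the DROP as a no-op and retries the file removal.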

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Asad Iqbal (Saadi) <asad.dsoft@gmail.com>