
docs: promote Environment Setup overhaul to prod #49

Merged
LukasWodka merged 15 commits into main from develop on Jun 5, 2026


@LukasWodka
Contributor

Promote the validated Environment Setup overhaul from develop to production.

Includes: new structure (Overview, Quick Start, Deployment environments + per-env pages, Operations, Security), workspace terminology, accuracy fixes, and the app=manager log-selector fix. All commands validated against the published chart + AWS CLI model. Video placeholder removed; old Setup Guide dropped from nav.

divyasinghds and others added 15 commits May 29, 2026 16:36
Re-applied on top of current main (original branch fix/robots-txt-block-static-assets
was cut from an old initial-commit state and rebasing produced unrelated conflicts
in favicon/logo/.mintignore/docs.json).

Clarity data shows bots (Apple, OpenAI, Google) spending ~200 requests/week on
/mintlify-assets/_next/static/ JS/CSS chunks. These have zero SEO value.

Adds custom robots.txt that blocks /mintlify-assets/ while keeping the existing
/cdn-cgi/ block and sitemap reference.
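
Rules like these can be sanity-checked offline with Python's standard `urllib.robotparser`. The sketch below is illustrative only: the two `Disallow` paths come from this commit message, but the wildcard `User-agent: *` group, the `docs.example.com` host, the sitemap URL, and the sample page path are assumptions, since the committed file itself is not shown here.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt body. Only the two Disallow paths are taken from the
# commit message; the user-agent group and sitemap URL are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cdn-cgi/
Disallow: /mintlify-assets/

Sitemap: https://docs.example.com/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "https://docs.example.com"  # hypothetical docs host

# A static JS chunk under /mintlify-assets/ should be off-limits to crawlers.
blocked = parser.can_fetch("Googlebot", base + "/mintlify-assets/_next/static/chunk.js")
# An ordinary documentation page (hypothetical path) should stay crawlable.
allowed = parser.can_fetch("Googlebot", base + "/environment-setup/overview")

print("static asset crawlable:", blocked)
print("docs page crawlable:", allowed)
```

Parsing from a string keeps the check independent of the deployed site, so it could run in CI before the file ships.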

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ets-rebased

Block bots from crawling Mintlify static assets
Fixes the docs side of data-ingestors#131:

- B2 / B1: section 2 of the declarative path linked out for the
  staging recipe and described `kubectl cp` while the Detailed Setup
  section (further down on the same page) prescribed a host-path
  `cp -R`. Replaced section 2 with an inline host-path recipe that
  matches the Detailed Setup section, and demoted `kubectl cp` to a
  Note for multi-node / EKS deployments. The recipe now uses a
  `<prefix>` subdirectory so the path lines up with the
  `/data/shared/<prefix>/...` style used in ingest.yaml examples.
- C2: section 4 was silent on where CLIENT_ID / CLIENT_PASSWORD come
  from in the declarative path. Added a sentence noting the ingestor
  Pod inherits them from the Kubernetes Secret the parent
  tracebloc/client chart creates in <workspace> at install time —
  no creds are passed on the `helm install` line.
- C5: section 4 mentioned the run-twice rule only as a trailing
  parenthetical. Promoted it to bolded prose and added a worked
  train + test pair (two `helm install` invocations, distinct
  release names + `table:` + `intent:`) so the rule is concrete.
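
The run-twice rule from C5 can be sketched as a small planning script. This is not the recipe from the docs page: the release names, table names, and values-file names are hypothetical, and passing the ingest config through Helm's standard `-f` flag is an assumption about how the chart is fed. Only the `tracebloc/ingestor` chart name, the `<workspace>` placeholder, and the `table:` / `intent:` keys come from this PR.

```python
import shlex

CHART = "tracebloc/ingestor"  # chart name as referenced in this PR

# Hypothetical runs: one per intent, each with its own release name and table.
# The values files are assumed to carry the matching `table:` and `intent:` keys.
RUNS = [
    {"release": "ingest-train", "table": "demo_train", "intent": "train",
     "values": "ingest-train.yaml"},
    {"release": "ingest-test", "table": "demo_test", "intent": "test",
     "values": "ingest-test.yaml"},
]


def check_distinct(runs):
    """Raise ValueError if two runs share a release name, table, or intent."""
    for field in ("release", "table", "intent"):
        values = [run[field] for run in runs]
        if len(set(values)) != len(values):
            raise ValueError(f"runs must use distinct {field} values, got {values}")


def helm_command(run, namespace="<workspace>"):
    """Build the argv for one install; -f and --namespace are standard Helm flags."""
    return ["helm", "install", run["release"], CHART,
            "--namespace", namespace, "-f", run["values"]]


check_distinct(RUNS)
commands = [helm_command(run) for run in RUNS]
for argv in commands:
    # Printed for review only; nothing is executed here.
    print(shlex.join(argv))
```

Building the commands as argument lists and printing them keeps the sketch side-effect free; the distinctness check is the part that encodes the rule.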

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Reword overview sentence: "Regardless of where" → "Whether" (cleaner).
- Move the `--reset-then-reuse-values` caveat above section 1 so the
  warning appears before any commands the user could run, and clarify
  it only applies to upgrades of the parent `tracebloc/client` chart
  (not the `helm install tracebloc/ingestor` runs in step 4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The templates table in create-use-case/templates.mdx listed 9 supported
tasks but was missing masked_language_modeling, even though the
template exists in data-ingestors. Added the row alongside the others.

Deep MLM-specific guidance (tokenizer.json requirements, validation,
troubleshooting) lives in the data-ingestors template README, next to
the TokenizerValidator itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A reviewer hit this on a fresh MLM install:

  ingest_config validation failed:
    <root>: Additional properties are not allowed ('sequences' was unexpected)
    category: 'masked_language_modeling' is not one of [...]

Both symptoms point at a stale local Helm chart cache that predates
the newer category or schema field. `helm repo update` refreshes the
cache and the next `helm install` picks up the current schema. Added
a Warning callout in step 4 of the Declarative YAML section, scoped
generically (any category / schema field, not just MLM).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…art cache

Companion to data-ingestors PR #133 commit 5550d1a. Reviewer
(@LukasWodka) pointed out the previous Warning was diagnostically
wrong: the schema-validation error comes from jobs-manager's
submit-time check against its own bundled schema, not from the local
Helm chart index, so `helm repo update` is a no-op. The fix is to
upgrade the parent `tracebloc/client` chart so jobs-manager
redeploys with the current schema.
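
To make the corrected diagnosis concrete, here is a minimal stdlib-only sketch of a validator that checks a config against whichever schema it was shipped with. The schema shape, the category names other than `masked_language_modeling`, and the sample config are all hypothetical; the real jobs-manager schema is not reproduced in this PR. The point is only that the same config fails against an older bundled schema and passes against a newer one, regardless of what the local Helm cache holds.

```python
# Hypothetical, heavily simplified schemas standing in for the one bundled
# with jobs-manager. Real key names and categories may differ.
OLD_SCHEMA = {
    "allowed_keys": {"category", "table", "intent"},
    "categories": {"image_classification", "text_classification"},
}
NEW_SCHEMA = {
    "allowed_keys": OLD_SCHEMA["allowed_keys"] | {"sequences"},
    "categories": OLD_SCHEMA["categories"] | {"masked_language_modeling"},
}


def validate(config, schema):
    """Return a list of error strings; an empty list means the config passed."""
    errors = []
    for key in sorted(set(config) - schema["allowed_keys"]):
        errors.append(
            f"<root>: Additional properties are not allowed ('{key}' was unexpected)"
        )
    category = config.get("category")
    if category not in schema["categories"]:
        errors.append(
            f"category: '{category}' is not one of {sorted(schema['categories'])}"
        )
    return errors


# Hypothetical MLM ingest config using a newer field and a newer category.
config = {
    "category": "masked_language_modeling",
    "table": "demo_train",
    "intent": "train",
    "sequences": {"max_length": 128},
}

stale_errors = validate(config, OLD_SCHEMA)    # validator still on the old schema
current_errors = validate(config, NEW_SCHEMA)  # validator redeployed with the new one

for line in stale_errors:
    print(line)
```

Both symptoms come from the schema held by the validating side, so only redeploying that side changes the outcome.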

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…staging

docs: make declarative-ingest staging self-contained (data-ingestors#131 B/C)
Independent of the installer namespace change (client#192) -- safe to ship now.

Accuracy:
- Firewall domains: drop the misleading `github.com`; add ghcr.io (ingestor),
  raw.githubusercontent.com (scripts), *.github.io (chart repo).
- Drop the phantom HTTP_PORT/HTTPS_PORT knobs (installer disables ingress).
- Troubleshooting: document the `--diagnose` support bundle.
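
As a rough illustration of how the revised domain list behaves, the sketch below matches hostnames against the three entries this commit adds. It is an assumption-laden example, not the installer's logic: the list is partial (other required domains are not named here), the sample hostnames are hypothetical, and shell-style wildcard matching via `fnmatch` is only one possible interpretation of `*.github.io`.

```python
from fnmatch import fnmatchcase

# Only the three entries added in this commit; a real allowlist would also
# include the other domains the installer needs, which are not listed here.
ALLOWED_PATTERNS = ["ghcr.io", "raw.githubusercontent.com", "*.github.io"]


def is_allowed(host, patterns=ALLOWED_PATTERNS):
    """Return True if the hostname matches any allowlist pattern.

    Hostnames are compared case-insensitively. Note that '*' here also
    matches dots, so nested subdomains of github.io match as well.
    """
    normalized = host.lower().rstrip(".")
    return any(fnmatchcase(normalized, pattern) for pattern in patterns)


# Hypothetical hostnames used purely for illustration.
for host in ["ghcr.io", "tracebloc.github.io", "github.com"]:
    print(host, "->", "allowed" if is_allowed(host) else "not in list")
```

The last case reflects the accuracy fix: `github.com` itself is no longer on the list, while the chart repo under `*.github.io` is.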

Copy + structure:
- EKS: "when to use EKS vs local" callout + back-link to setup-guide;
  Quick-vs-Detailed signpost; dropped softeners/verbosity.
- Configuration: installer-vs-Helm audience signpost; tightened verbose lines.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
docs(environment-setup): accuracy + copy/structure (no #192 dependency)
Design preview for the section restructure (additive -- sits alongside the
existing pages). Namespace-agnostic, no #192 dependency.

- overview.mdx: trust-boundary Mermaid diagram, lifecycle-as-trust-story,
  what-stays/leaves table, glossary, "what it touches".
- quickstart.mdx: one-liner with expectations, inspect-first path, signed-CLI
  note, locked-down escape hatch, placeholder for a terminal-cast demo.
- docs.json: both added to the top of the Environment Setup nav.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- TERMINOLOGY.md (repo root): seed terminology sheet -- preferred/avoid terms,
  words to retire, open questions (client vs workspace). Lukas owns + extends.
- overview: fuller "how it works" (deploy -> ingest -> use case -> whitelist
  contributors -> submit/train -> results only); weights shared only if the
  owner allows (admin panel); fixed the client definition; dropped "box".
- quickstart: "no Docker/Kubernetes knowledge needed" reassurance (the installer
  installs Docker); lowered specs to 2 CPU / 4 GB RAM (preflight only warns,
  never hard-fails on RAM/CPU).
- setup-guide: same spec correction (4->2 CPU, 8->4 GB RAM).
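
The warn-only behaviour described above can be sketched as follows. This is not the installer's preflight code: the function name and message wording are hypothetical, and the caller is assumed to supply the detected CPU count and RAM. Only the 2 CPU / 4 GB thresholds and the "warn, never hard-fail" rule come from this commit.

```python
MIN_CPUS = 2    # lowered from 4 in this commit
MIN_RAM_GB = 4  # lowered from 8 in this commit


def preflight_warnings(cpus, ram_gb):
    """Return warning strings for low resources; never raises or aborts."""
    warnings = []
    if cpus < MIN_CPUS:
        warnings.append(f"only {cpus} CPU(s) detected; {MIN_CPUS} recommended")
    if ram_gb < MIN_RAM_GB:
        warnings.append(f"only {ram_gb} GB RAM detected; {MIN_RAM_GB} GB recommended")
    return warnings


# A machine at the documented minimum produces no warnings.
at_minimum = preflight_warnings(cpus=2, ram_gb=4)

# An undersized machine produces warnings, but the caller still proceeds.
undersized = preflight_warnings(cpus=1, ram_gb=2)
for message in undersized:
    print("WARNING:", message)
print("continuing with install")
```

Returning warnings instead of raising is what keeps the check advisory: the decision to continue stays with the caller.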

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s, Security (mockup)

Additive preview of the restructured section on develop:
- deployment-environments: landing -- comparison table + the shared 6-heading template.
- deploy-local / deploy-bare-metal / deploy-aks / deploy-openshift: per-environment
  guides on the template (EKS reuses the existing detailed guide).
- operations: day-2 -- version, health, logs, stop/start, upgrade, rollback, move,
  uninstall, backup.
- security: dedicated "Security & data handling" page for the data-first audience.
- docs.json: nested "Deployment environments" group + Operations/Security; Overview
  and Quick Start now link to the new landing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Validated against the published chart via `helm template`: the jobs-manager
pod label is `app: manager`, so `-l app=tracebloc-jobs-manager` matched no pods.
Corrected in operations, configuration, and troubleshooting.
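
Why the old selector returned nothing can be shown with a small model of equality-based label matching. This is a simplified sketch, not kubectl's implementation: it handles only `key=value` terms joined by commas, and the pod name is hypothetical. The `app: manager` label and the two selectors are taken from this commit.

```python
def parse_selector(selector):
    """Parse a simple 'key=value,key2=value2' selector into a dict."""
    pairs = {}
    for term in selector.split(","):
        key, _, value = term.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


def matches(labels, selector):
    """True if every key=value term in the selector is present in the labels."""
    return all(labels.get(key) == value
               for key, value in parse_selector(selector).items())


def select(pods, selector):
    """Return the sorted names of pods whose labels satisfy the selector."""
    return sorted(name for name, labels in pods.items()
                  if matches(labels, selector))


# Hypothetical pod name; the label is the one reported by `helm template`.
PODS = {"jobs-manager-abc123": {"app": "manager"}}

old_result = select(PODS, "app=tracebloc-jobs-manager")  # selector used before the fix
new_result = select(PODS, "app=manager")                 # corrected selector

print("old selector:", old_result)
print("new selector:", new_result)
```

A selector that matches no pods is not an error, which is why the original command failed silently rather than loudly.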

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Use "workspace" as the user-facing term across the new pages; "client" demoted
  to "Client ID" only. Keeps tracebloc/client (chart) and the clients page (UI).
- Remove the Quick Start video placeholder (re-add when recorded).
- Drop the superseded old Setup Guide from the nav.
- TERMINOLOGY.md: record the workspace decision (option b).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@LukasWodka LukasWodka merged commit 2255743 into main Jun 5, 2026
1 check passed
@LukasWodka
Contributor Author

👋 Heads-up — Code review queue is at 14 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)


2 participants