Archie Intent Review: PR-time semantic review + snapshot-vs-contract sync #102
Merged
Captures the brainstormed design for a PR-time semantic review that checks a branch's folded blueprint/rules diff (branch vs base) against retained invariants, posts an FYI comment, and never blocks. POC scope: GitHub Action + zero-dep script. Documents that /archie-sync already folds into the blueprint on the branch, the Layer 1/2 design, guardrails from the adversarial review, the decisions log, dependency chain, and open questions for planning. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
End-to-end delivery plan (7 milestones, grounded against deep-scan/sync/distribution internals via a research+critique+revise pass) plus the two canonical setup assets the plan's M4/M5 produce:
- docs/archie-intent-review-delivery-plan.md: the plan
- archie/assets/workflows/archie-intent-review.yml: the Action (on: pull_request, fetch base ref, runs .archie/intent_review.py)
- archie/assets/setup-archie-intent-review.sh: idempotent gh-based one-command CI setup (prereq checks, silent secret via gh secret set, copies the canonical YAML, fork-PR caveat), no GitHub-web tinkering
NOTE: not yet distributed. The file-sync wiring (npm-package/assets copies + verify_sync.py .sh/.yml/plural-workflows checks + archie.mjs/install.py entries) and the review engine intent_review.py are milestones M1a-M3 of the plan.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ment, tests + wiring
intent_review.py (zero-dep): deterministic keyed diff of branch-folded blueprint/rules vs base ref, sync-ledger glob (union of change_*.json), conservative ledger join (file overlap AND keyword), one Haiku tool_use call that JUDGES only (script overwrites diff_op/ids/layer), because-or-suppress, upserted FYI PR comment, always exit 0 (never blocks; fork/no-secret early skip).
Wiring (M1a): archie.mjs + install.py (+ mirror) script lists; verify_sync.py now byte-checks the plural workflows/ mirror and the setup .sh (drift-tested).
Tests (M7): 25 cases, all green; full suite 1003 passed / 1 skipped; verify_sync green.
Workflow YAML + gh setup script are the M4/M5 canonical assets.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
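A minimal sketch of what a deterministic keyed diff could look like, to make the "script owns what changed" idea concrete. This is not the code in intent_review.py: the function signatures, the `REMOVE`/`CHANGE` op names, and the sample rules are assumptions for illustration (only `ADD` and `item_key` appear in the commit messages).

```python
import hashlib
import json


def item_key(item: dict) -> str:
    """Stable key: prefer an explicit id, then a title, else hash the whole item."""
    for field in ("id", "title"):
        value = item.get(field)
        if value:
            return f"{field}:{value}"
    canonical = json.dumps(item, sort_keys=True, separators=(",", ":"))
    return "hash:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def keyed_diff(base_items: list, head_items: list) -> list:
    """Return ADD/REMOVE/CHANGE ops in sorted-key order, so reruns are identical."""
    base = {item_key(item): item for item in base_items}
    head = {item_key(item): item for item in head_items}
    ops = []
    for key in sorted(base.keys() | head.keys()):
        if key not in base:
            ops.append({"op": "ADD", "key": key, "after": head[key]})
        elif key not in head:
            ops.append({"op": "REMOVE", "key": key, "before": base[key]})
        elif base[key] != head[key]:
            ops.append({"op": "CHANGE", "key": key,
                        "before": base[key], "after": head[key]})
    return ops


# Hypothetical rule items, loosely modelled on the "cap 7->12" example.
base_rules = [{"id": "cap-limit", "text": "cap is 7"},
              {"id": "no-globals", "text": "no globals"}]
head_rules = [{"id": "cap-limit", "text": "cap is 12"},
              {"id": "layering", "text": "ui never imports db"}]

for op in keyed_diff(base_rules, head_rules):
    print(op["op"], op["key"])
# CHANGE id:cap-limit
# ADD id:layering
# REMOVE id:no-globals
```

Because the ops come from the script rather than the model, the model's only job is to judge them, which matches the "JUDGES only" note above.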
…, coverage
From a 6-agent adversarial verification pass:
Blockers:
- never-block guarantee: route both comment posts through safe_post_comment(), which swallows URLError (covers HTTPError) + OSError, so a network hiccup no longer fails the Action.
- data-section diff: removed a dead `pass` that let pure-ADD data models through unintentionally; the script now surfaces every data change and the model judges.
Coverage / faithfulness (plan-required sections):
- diff platform_rules.json too (unioned with rules.json).
- diff pitfalls (id), decisions.trade_offs + out_of_scope (title-hash).
- unenforced_invariants deliberately excluded (advisory gaps); documented.
- item_key falls back to a full-item hash so title-less items don't collide.
- comment lookup follows Link-header pagination (no dup comment past 100).
- workflow: continue-on-error on the base-ref fetch.
Distribution:
- npx installer now places setup-archie-intent-review.sh and workflows/ into .archie/, so `bash .archie/setup-archie-intent-review.sh` works post-install and resolve_workflow_src() finds the canonical YAML.
Tests: +18 (43 total): new sections, pagination, URLError-swallow, retry/backoff, flag order, path-overlap, item_key fallback, and a full main() integration via a real origin clone.
Full suite 1021 passed / 1 skipped; verify_sync green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
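For readers who want to see the never-block guarantee in code, here is a hedged sketch of a wrapper in the spirit of safe_post_comment(). The name comes from the commit message, but the signature, the injectable `opener` parameter, the headers, and the placeholder URL are assumptions, not the shipped implementation.

```python
import json
import sys
import urllib.error
import urllib.request


def safe_post_comment(url: str, token: str, body: str,
                      opener=urllib.request.urlopen) -> bool:
    """Try to post a comment; report failure by return value instead of raising."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"body": body}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with opener(request, timeout=30) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError) as exc:
        # HTTPError subclasses URLError, so API-level errors land here too.
        print(f"intent-review: comment not posted ({exc})", file=sys.stderr)
        return False


def _offline(request, timeout=None):
    """Stand-in opener that simulates a network hiccup (illustration only)."""
    raise urllib.error.URLError("network unreachable")


posted = safe_post_comment(
    "https://api.github.com/repos/OWNER/REPO/issues/1/comments",  # placeholder
    "dummy-token",
    "FYI: intent review",
    opener=_offline,
)
print(posted)  # False
```

The caller can ignore the return value and still exit 0, which is what keeps a failed post from failing the Action.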
… coverage
Validation review (does it deliver the value, not just pass tests) surfaced the linchpin risk and a coverage gap:
- Base ref: diff against github.event.pull_request.base.sha (always in merge-ref history with fetch-depth:0) instead of origin/<base>. Removed the fragile `git fetch` + continue-on-error step that could silently degrade the diff to "everything is new" and post a confident-but-wrong review.
- fetch_base_file now distinguishes "file genuinely absent at a valid ref" (legitimate all-ADD) from "ref unresolvable"; the latter posts a loud "review skipped" note rather than a misleading all-new result.
- Coverage: components[] now diffed (keyed, Layer 2) so component removal / responsibility changes are caught; communication/descriptive snapshots remain deliberately out of POC scope, now documented (not a silent divergence).
Tests: +3 (45 total) incl. unresolvable-ref-is-error, component-remove, and the main() integration now drives the base-SHA path via a real origin clone.
Full suite 1023 passed / 1 skipped; verify_sync green.
Delivery plan §10 records the amendments. M6 dogfood remains the user's real-repo step.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
.archie/*.py is gitignored (regenerated by the installer), so intent_review.py would never reach CI — the Action runs `python3 .archie/intent_review.py` where no Archie install exists. Add it to the committed hook-runtime exception set (alongside _common/lint_gate/align_check/arch_review) so it ships in the repo. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…iding rules Real-repo test produced 8 near-identical findings (2 functions x 4 rules) for a single change (cap 7->12). Now the model emits ONE finding per distinct change, spanning multiple item_refs and listing ALL colliding_rules in a single cited because. A dedup backstop merges any findings the model still splits (same type + same colliding-rule set). Render shows "<change> (op, Layer N · K sites) — Collides with: rule1, rule2…". 8 -> 1. Tests: +2 (consolidate-across-items, dedup-merges-split); 47 in-file, full suite 1025 passed / 1 skipped; verify_sync green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
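As an illustration of the dedup backstop, the sketch below merges findings that share a type and a colliding-rule set. The field names `item_refs`, `colliding_rules`, and `because` follow the commit message; the function name, the `type` values, and the sample data are hypothetical.

```python
def merge_split_findings(findings: list) -> list:
    """Merge findings with the same type and the same colliding-rule set."""
    merged = {}
    for finding in findings:
        signature = (finding["type"], frozenset(finding["colliding_rules"]))
        if signature not in merged:
            merged[signature] = {
                "type": finding["type"],
                "colliding_rules": sorted(set(finding["colliding_rules"])),
                "item_refs": [],
                "because": finding["because"],  # keep the first cited reason
            }
        for ref in finding["item_refs"]:
            if ref not in merged[signature]["item_refs"]:
                merged[signature]["item_refs"].append(ref)
    # dicts preserve insertion order, so the first-seen order is kept
    return list(merged.values())


# Hypothetical case: the model split one change across two call sites.
split = [
    {"type": "violation", "item_refs": ["fn_a"],
     "colliding_rules": ["rule1", "rule2"], "because": "cap raised from 7 to 12"},
    {"type": "violation", "item_refs": ["fn_b"],
     "colliding_rules": ["rule2", "rule1"], "because": "cap raised from 7 to 12"},
]

result = merge_split_findings(split)
print(len(result), result[0]["item_refs"], result[0]["colliding_rules"])
# 1 ['fn_a', 'fn_b'] ['rule1', 'rule2']
```

Using a frozenset for the rule set makes the merge insensitive to the order in which the model lists the rules.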
Phased plan to split the blueprint into a mirror (sync auto-updates from the diff) and a contract (the law — deliberate edits only), so Intent Review reliably catches drift and distinguishes violation from intended amendment via the diff, not commit prose. Phase 1: sync code-fold becomes contract-readonly. Phase 2: deliberate amendment path. Phase 3: review labels amendment vs violation. Grounded in sync.py _SECTION_MAP/fold-context/fold-apply + the cap worked example. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
No manual rule-editing: the human only Fixes (code complies) or Accepts (merge). Accept's contract change is auto-drafted by the system (affected + interlocked rules) and applied on merge — the "auto-drafted amendment" from the original brainstorm. The merge-vs-fix choice is the intent signal. Phase 3 drafts the amendment + presents Fix-or-Accept; Phase 2 applies the draft (in-PR suggestion or on-merge reconcile). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Phase 1 (sync code-fold contract-readonly: gate _KIND_TARGET to mirror kinds + byte-identity guardrail + rendered-doc audit + SKILL/tests), Phase 3 (review drafts the consistent interlock amendment + Fix-or-Accept render + consistency check), Phase 2 (apply the draft on Accept via in-PR suggestion or on-merge reconcile). Grounded in sync.py _KIND_TARGET/fold-context/fold-apply and intent_review.py. Ship order 1 -> 3 -> 2. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Goal narrowed to 'reliably see deviations on the PR'. The review already shows them; the only remaining gap is Phase 1 — stop sync's code-fold from silently moving the rules to match the code (which would hide a deviation). Drafting + Fix-or-Accept + on-merge apply are deferred until the seeing-problems loop is solid. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tract)
A code-fold must never silently move the law, or a real code-vs-contract deviation would be hidden from the PR Intent Review.
- _classify: advisory kinds (decision/pitfall/rule/guideline) are ALWAYS `staged`, never eligible/folded, because they're the contract. Only the descriptive mirror folds.
- fold-context: surfaces advisory claims as `staged_amendments` (proposed, not folded) and snapshots a contract fingerprint (invariant sections + rules.json + platform_rules.json).
- fold-apply: refuses (before render) any fold that changed the contract fingerprint.
- SKILL.md Step 4: edit the mirror only; advisory = proposed amendments.
Tests: advisory-always-staged, advisory-not-a-fold-target, contract guardrail aborts on rules.json and invariant edits; updated the fold tests that encoded the old advisory-folds behavior.
test_sync 21 pass; full suite 1028 passed/1 skipped; verify_sync green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
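The classification rule can be pictured with the sketch below. It is an assumption-laden illustration rather than the code in sync.py: the claim shape, the `build_fold_context` helper, and the `component` kind are hypothetical, and the real `_classify` may stage descriptive claims for other reasons this sketch ignores.

```python
ADVISORY_KINDS = frozenset({"decision", "pitfall", "rule", "guideline"})


def classify(claim: dict) -> str:
    """Advisory kinds are the contract: always staged, never folded."""
    if claim.get("kind") in ADVISORY_KINDS:
        return "staged"
    return "eligible"


def build_fold_context(claims: list) -> dict:
    """Split claims into foldable mirror updates and proposed amendments."""
    context = {"eligible": [], "staged_amendments": []}
    for claim in claims:
        if classify(claim) == "staged":
            context["staged_amendments"].append(claim)
        else:
            context["eligible"].append(claim)
    return context


claims = [
    {"kind": "component", "summary": "review engine gained a retry helper"},
    {"kind": "rule", "summary": "cap raised from 7 to 12"},
]
context = build_fold_context(claims)
print(len(context["eligible"]), len(context["staged_amendments"]))  # 1 1
```

The rule claim ends up as a proposed amendment, so the contract text stays untouched and the deviation remains visible to the PR review.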
Rules legitimately change during real work — the guardrail must not refuse them. fold-apply no longer aborts when a fold also changed the contract; it proceeds and reports contract_changed + a note, so the law can move DELIBERATELY but never SILENTLY. advisory->staged still stops AUTOMATIC contract moves; deep-scan and deliberate edits change rules as before. SKILL updated. test_sync 21 pass; full suite green; verify_sync green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Audit gaps:
- contract fingerprint now covers the prescriptive blueprint sections too
(development_rules / infrastructure_rules / architecture_rules), not just the
invariants + rule files — these are the law-in-the-blueprint and no descriptive
kind targets them, so it's safe.
- corrected stale comment ("rule is the only kind that edits rules.json") and the
SKILL eligibility line to state advisory kinds are ALWAYS staged.
test_sync 22 pass; full suite green; verify_sync green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
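A rough sketch of a contract fingerprint covering the sections named above, plus the non-blocking `contract_changed` report. The section key `invariants`, the function names, and the note wording are assumptions; the three `*_rules` section names and `contract_changed` come from the commit messages.

```python
import hashlib
import json

# "invariants" is a hypothetical key standing in for the invariant sections.
CONTRACT_SECTIONS = ("invariants", "development_rules",
                     "infrastructure_rules", "architecture_rules")


def contract_fingerprint(blueprint: dict, rules: dict, platform_rules: dict) -> str:
    """Hash only the prescriptive parts; descriptive sections are ignored."""
    payload = {
        "sections": {name: blueprint.get(name) for name in CONTRACT_SECTIONS},
        "rules": rules,
        "platform_rules": platform_rules,
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fold_report(before: str, after: str) -> dict:
    """Report a contract change instead of aborting the fold."""
    report = {"contract_changed": before != after}
    if report["contract_changed"]:
        report["note"] = "contract changed during this fold; confirm it was deliberate"
    return report


blueprint = {"components": ["engine"], "development_rules": ["cap is 7"]}
before = contract_fingerprint(blueprint, {"rules": []}, {"rules": []})

# A descriptive (mirror) edit leaves the fingerprint alone.
mirror_edit = dict(blueprint, components=["engine", "installer"])
print(fold_report(before, contract_fingerprint(mirror_edit, {"rules": []}, {"rules": []})))
# {'contract_changed': False}

# A prescriptive edit is reported, not blocked.
law_edit = dict(blueprint, development_rules=["cap is 12"])
print(fold_report(before, contract_fingerprint(law_edit, {"rules": []}, {"rules": []}))["contract_changed"])
# True
```

Serializing with sorted keys keeps the fingerprint stable when only key order differs between runs.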
What
A new Intent Review GitHub Action that flags, on a PR, when a change deviates from the architectural source of truth ("you're breaking rule XY"), plus the snapshot-vs-contract refinement to /archie-sync that makes those deviations reliably visible.
The original idea: use the blueprint/rules as a baseline; when a dev finishes work, diff their change against the single source of truth and flag it on the PR; merge = the new baseline.
How it works
On pull_request, the Action (intent_review.py):
- Diffs .archie/blueprint.json + rules.json against the PR base SHA (deterministic, keyed diff; the script owns what changed).
- Reads the sync ledger (.archie/changes/) for corroborating intent.
Snapshot vs. contract (sync Phase 1)
/archie-sync's code-fold now reconciles the descriptive mirror only and never silently moves the contract (rules/invariants):
- advisory kinds (decision/pitfall/rule/guideline) are always staged, never auto-folded;
- fold-apply reports contract_changed when the law is changed deliberately: visible, not blocked.
This guarantees a real code-vs-law deviation always reaches the PR instead of being papered over.
Contents
- archie/standalone/intent_review.py: the zero-dep review engine (+ npm asset)
- archie/assets/workflows/archie-intent-review.yml: the Action (checkout@v5/setup-python@v6)
- archie/assets/setup-archie-intent-review.sh: one-command gh-based setup (no GitHub-web tinkering)
- archie/standalone/sync.py: snapshot-vs-contract Phase 1
- scripts/verify_sync.py, archie.mjs, install.py: distribution wiring (intent_review committed via gitignore exception so CI can run it)
- docs/: design doc, delivery plan, snapshot-vs-contract plan + implementation
- tests/test_intent_review.py (47) + tests/test_sync.py updates
Validation
- verify_sync green.
Known limitations / deferred (by design)
- Requires /archie-sync to have run; a raw code PR with no sync is invisible (Layer-3 raw-code reading deferred behind an eval gate).
🤖 Generated with Claude Code