
feat: Structural Integrity Score + PR gate, and automated-sync #104

Merged
gbrbks merged 22 commits into main from feature/architecture-integrity-score
Jun 25, 2026

@gbrbks gbrbks commented Jun 25, 2026

Collaborator

Two related bodies of work land together on this branch (the automated-sync feature, PR #103, was merged into this branch and rides along — flag if you'd prefer them split before merge).

1. Structural Integrity Score + PR gate

A deterministic headline score measuring how well the code upholds the structurally checkable parts of its documented contract — layering, dependency direction, placement, naming, DI wiring, plus whether product laws have an enforcement mechanism. It is not a quality grade and does not judge behavioral/product-law correctness (that stays in the LLM review layer); generic complexity is shown as hygiene only.

  • Pure scoring engine (scoring.py) + CLI (score.py): min(weighted body, geometric correctness-ceiling); worklist-first output (the worklist is the point, the number is the roll-up).
  • Reconciliation model: a divergence is Upheld / Accepted (staged amendment via /archie-sync) / Open — only open grounded divergences in a diff block the gate (exit 1); the number never blocks.
  • Calibration harness: only rules that pass a precision bar (the "smoke-alarm test") are block_eligible; jumpy rules demote to advisory.
  • Surfaced at dev time, scan time (/archie-deep-scan Step 9), and CI; rendered in the terminal, the PR comment, and the viewer (Risks section + sidebar).
  • Divergences are grouped by (file, rule) with a title + detail; .archieignore/.gitignore honored across the worklist and LOC normalization.
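The gate rule in the reconciliation bullet can be sketched roughly as follows. This is an illustration only, not the code in score.py: the `Divergence` record, its field names, and `gate_exit_code` are hypothetical, and whatever "grounded" means in the real engine is reduced to a boolean here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Divergence:
    file: str
    rule: str
    status: str     # "upheld", "accepted", or "open" (assumed spelling)
    grounded: bool  # hypothetical flag standing in for "grounded"


def gate_exit_code(divergences, changed_files):
    """Return 1 only when an open, grounded divergence sits in the diff."""
    changed = set(changed_files)
    blocking = [
        d for d in divergences
        if d.status == "open" and d.grounded and d.file in changed
    ]
    # The score itself never appears here: only the worklist can block.
    return 1 if blocking else 0
```

Under these assumptions, an accepted divergence, or an open one in a file the PR did not touch, leaves the exit code at 0.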

2. PR sync-advisory (this session)

A durable, reviewer-visible nudge when code changed without an /archie-sync — the boundary the session hooks can miss.

  • sync.py sync-stamp writes committed .archie/sync_state.json (a content fingerprint of reconciled source).
  • intent_review.py flags PR-changed source files whose current content differs from the last stamp (content-based → rebase/squash-immune; O(diff); honors ignore/SKIP_DIRS; skips deletions; -z so non-ASCII paths aren't dropped). Advisory only — never blocks merge.
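Why `-z` matters can be shown with a small parser. Without `-z`, git quotes and escapes paths containing non-ASCII bytes (subject to `core.quotePath`), so a naive line split sees an escaped string rather than the real path. The helper name below is hypothetical and is not taken from intent_review.py.

```python
def parse_nul_separated_paths(raw: bytes) -> list[str]:
    """Split NUL-separated `git diff --name-only -z` output into paths."""
    return [
        # surrogateescape keeps odd byte sequences round-trippable
        chunk.decode("utf-8", errors="surrogateescape")
        for chunk in raw.split(b"\0")
        if chunk  # the output ends with a trailing NUL, so drop the empty tail
    ]
```

The input would typically be the captured stdout of something like `git diff --name-only -z <base>`, read as bytes.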

3. Carried from PR #103 (automated-sync)

Background hooks accrue churn + captured plans and nudge /archie-sync at turn-end (exit 2, with a stop_hook_active loop guard) and at commit time; the sync skill consumes those signals. Also fixed: the merge had left the npm mirror out of sync (churn-track.sh + stale hooks/manifest/SKILL would have shipped the feature dead via npx) — restored, and verify_sync now guards the hook_scripts subtree.

Verification

  • verify_sync green (scripts + workflow + viewer + hook_scripts mirrors).
  • Full suite green (scoring, calibration, sync, intent_review, automated-sync hooks, asset-sync).
  • The sync-advisory was hardened across two adversarial multi-agent review rounds (selection-mismatch, regressions, an O(repo) perf regression, never-raises) — all confirmed findings fixed with regression tests.

🤖 Generated with Claude Code

gbrbks and others added 22 commits June 24, 2026 11:19
…eep-scan & sync

A deterministic integrity layer: a worklist of open contract divergences (each
file:line + the decision/law it breaks) rolled up into one score. The number is
never the gate — only open grounded divergences in a diff block a PR.

- scoring.py: composite = min(weighted arithmetic body, geometric correctness-
  ceiling over {Reconciliation, Product-Law Coverage}). No floor/drag magic
  constants. Structural Health is an informational panel, not a headline axis.
  Size-normalized axis derivations; absent != a free 100.
- score.py: reads .archie/ artifacts, computes the AIS + worklist, explains the
  context (explain()), renders worklist-first terminal / PR-markdown views,
  persists the committed baseline (score.json + history), and a diff-scoped gate
  (--diff <base>, exit 1 on a grounded divergence in the diff).
- Wired into /archie-deep-scan Step 9 (baseline write + closing-summary line) and
  /archie-sync (integrity standing after record). No new slash command.
- 22 tests; npm-package assets + installer in sync (verify_sync passes).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
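The composite described in the scoring.py bullet above can be sketched as below. This is a minimal illustration under stated assumptions: the axis names, the equal weights in the example, and the 0..100 scale are hypothetical, and the sketch omits the size normalization and absent-axis handling the commit mentions.

```python
import math


def composite(axes, weights,
              ceiling_axes=("reconciliation", "product_law_coverage")):
    """min(weighted arithmetic body, geometric mean of the ceiling axes)."""
    total_weight = sum(weights[name] for name in axes)
    body = sum(axes[name] * weights[name] for name in axes) / total_weight
    # Geometric mean: one weak correctness axis pulls the ceiling down hard.
    ceiling = math.prod(axes[name] for name in ceiling_axes) ** (1 / len(ceiling_axes))
    return min(body, ceiling)
```

With reconciliation at 64 and two other axes at 100 (equal weights), the arithmetic body is 88 but the geometric ceiling is 80, so the composite is 80: a strong body cannot paper over a weak correctness axis.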
…/viewer bundle

build_bundle now packages .archie/score.json (headline, worklist of open
divergences, and the plain-language explanation block) as bundle["integrity"],
so the local viewer (/api/bundle) and /archie-share carry the context — not just
a number. Rendering it in the React view is a frontend change (that source lives
outside this repo).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ock a build

The "smoke alarm test": run each rule against known-good and known-bad code
(including near-misses like the forbidden pattern sitting in a comment), measure
precision/recall, and mark a rule block_eligible only if precision >= 0.95.
Jumpy rules degrade to WARN. Reuses check_rules.py (the real gate engine);
labels come from how each case is built, so it's non-circular. The demo catches
a plausible raw-SQL rule that false-fires on a comment (precision 0.5 -> WARN).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
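The precision bar can be illustrated with a small sketch. It is not the harness itself: `calibrate` and the `(fired, is_bad)` case shape are hypothetical, and treating a rule that never fires as precision 0.0 is an assumption made here for simplicity.

```python
def calibrate(results, precision_bar=0.95):
    """results: list of (fired, is_bad) pairs, one per labelled case."""
    tp = sum(1 for fired, bad in results if fired and bad)
    fp = sum(1 for fired, bad in results if fired and not bad)
    fn = sum(1 for fired, bad in results if not fired and bad)
    # Assumption: a rule with no firings gets precision 0.0, so it cannot block.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "block_eligible": precision >= precision_bar,
    }
```

A rule that fires once on real raw SQL and once on a comment lands at precision 0.5, which matches the demo's outcome of a demotion to WARN.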
The gate reads .archie/rule_calibration.json: a grounded divergence whose rule
failed calibration (block_eligible false, i.e. too jumpy) is demoted to a warning
instead of failing the build. With no calibration data, behavior is unchanged, so
calibration can only demote a block to a warning; it never adds a new block. Adds
write_calibration() to the harness.

Demo: a raw-SQL rule that false-fires on a comment (precision 0.5) flips the gate
from BLOCK to PASS-with-warning once calibrated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
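How the gate might consult calibration data is sketched below. The shape of rule_calibration.json is not shown in this PR, so the mapping of rule id to `{"block_eligible": bool}` is an assumption, as are the function name and the rule ids in the usage note.

```python
def gate_severity(rule_id, calibration):
    """Decide whether a grounded divergence blocks or only warns."""
    entry = calibration.get(rule_id)
    if entry is None:
        # No calibration data for this rule: behavior is unchanged.
        return "block"
    # A rule that missed the precision bar is demoted, never escalated.
    return "block" if entry.get("block_eligible") else "warn"
```

For example, `gate_severity("raw-sql", {"raw-sql": {"block_eligible": False}})` yields `"warn"`, while the same call with an empty mapping yields `"block"`.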
The god-function and empty-catch platform rules leaked an internal
benchmark name into their user-facing descriptions. Rewrite both to
plain, actionable guidance. The name now lives only in
measure_health.py's internal docstring, never in surfaced text.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…onor ignores

Rename the headline from "Architecture Integrity" to "Structural
Integrity" across the terminal report, PR comment, viewer panel +
sidebar, and the deep-scan/sync workflow docs. The deterministic score
only covers structurally-checkable rules (layering, dependency
direction, placement, naming, DI wiring, law-enforcement presence), and
the name now says so.

- Add an explicit "what this is NOT" (limits) to the explanation: it is
  not a code-quality grade and does not judge behavioral / product-law
  correctness — that stays in the LLM review layer. Surfaced in the
  terminal footer, the PR "how to read this", and the viewer panel.
- Group open divergences by (file, rule) into one worklist entry
  carrying a title + detail + the affected lines/count; render the
  grouped shape in all three surfaces.
- Honor .archieignore/.gitignore in score.py's LOC fallback via
  IgnoreMatcher, matching check_rules' read-boundary. The worklist
  already respected ignores; the size-normalization denominator now
  does too (proven: worklist 1->0 and LOC 5001->0 under .archieignore).
- Move the integrity panel into the viewer's Risks section + sidebar.

The result field stays named `ais` for back-compat with the share
bundle, score.json, and the viewer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
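The (file, rule) grouping described in this commit can be sketched as follows. The dict keys and the `group_divergences` name are hypothetical; the real worklist entry shape in score.json may differ.

```python
from collections import defaultdict


def group_divergences(divergences):
    """Collapse per-line divergences into one worklist entry per (file, rule)."""
    groups = defaultdict(list)
    for d in divergences:
        groups[(d["file"], d["rule"])].append(d)

    worklist = []
    for (file, rule), items in sorted(groups.items()):
        lines = sorted(item["line"] for item in items)
        worklist.append({
            "file": file,
            "rule": rule,
            "title": items[0]["title"],    # assumed identical within a group
            "detail": items[0]["detail"],
            "lines": lines,
            "count": len(lines),
        })
    return worklist
```

Two hits of the same rule in the same file then surface as a single entry carrying both line numbers, rather than as two separate rows.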
…t guard

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Automated sync: self-propelling blueprint maintenance (Claude Code + Codex)
…a loop

The automated-sync merge (#103) updated only the canonical archie/assets
+ archie/standalone trees; the npm-package mirror was never synced, so
`npx @bitraptors/archie` shipped the feature dead. Restore the mirror and
close the Stop-hook loop bug.

F1 — npm distribution was broken:
  - churn-track.sh was missing from npm-package/assets/hook_scripts/
    (no churn hook would ship), and verify_sync didn't even check the
    hook_scripts mirror — which is how it slipped through.
  - sync.py (the six new subcommands), the /archie-sync SKILL.md (Step 1b
    + consume-on-success), and manifest_data.py (the churn-track HookDef)
    were stale in the mirror.
  - Adding the hook_scripts mirror check surfaced two more stale hooks the
    merge missed: post-plan-review.sh (plan-capture tee) and
    pre-commit-review.sh (commit advisory).
  Fix: sync all of them; add check_hook_scripts_mirror() to verify_sync so
  a new/edited hook script can never silently fail to ship again.

F2 — the Stop nudge could not be declined:
  stop.sh read no stdin and unconditionally exited 2 while the churn threshold
  was crossed, ignoring stop_hook_active. When the agent declined and tried to
  stop again, the nudge re-fired and re-blocked — an indefinite loop that
  defeated the "Decline if nothing is worth recording" affordance.
  Fix: read the Stop envelope and exit 0 when stop_hook_active is true
  (nudge once per stop attempt). Regression test added.

verify_sync green (now incl. hook_scripts); 34 sync/hook tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
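The F2 fix can be illustrated in Python, although the real hook is a shell script (stop.sh). The sketch assumes the Stop envelope arrives as JSON text with a boolean `stop_hook_active` field; the function name and the `churn_crossed` parameter are hypothetical.

```python
import json


def stop_hook_exit_code(envelope_text, churn_crossed):
    """Return the hook's exit code: 2 nudges (blocks the stop), 0 lets it through."""
    try:
        envelope = json.loads(envelope_text) if envelope_text.strip() else {}
    except json.JSONDecodeError:
        envelope = {}
    if not isinstance(envelope, dict):
        envelope = {}

    # Loop guard: this stop attempt already follows a nudge, so do not re-block.
    if envelope.get("stop_hook_active"):
        return 0
    return 2 if churn_crossed else 0
```

The guard means the agent can decline once and then stop, instead of being re-blocked on every attempt.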
Adds a durable, PR-visible reminder that the Living Blueprint may be stale —
the boundary the session hooks can miss (small work, declined nudges, or a
session that never hit a stop hook).

Detection is content-based (Option B), so it survives rebases/squashes and
works whether the synced changes were committed or not:

- `sync.py sync-stamp` writes committed `.archie/sync_state.json` — a sha1
  fingerprint of every source file the sync reconciled (honoring
  .archieignore/.gitignore). Wired into the sync SKILL's consume-on-success
  step next to plan-consume / churn-reset.
- `intent_review.py` (the PR action) computes `sync_advisory()`: the PR's
  changed source files whose CURRENT content differs from the recorded sync
  (or all of them, if no sync was ever recorded). It posts a non-blocking
  "run /archie-sync" section in the existing review comment — and now runs
  even when there's no blueprint diff (the exact case it must catch), which
  previously short-circuited.
- Shared `_common.file_sha1` / `source_fingerprint` so the stamp and the
  check agree on a file's identity by construction.

Advisory only — never blocks merge, consistent with Archie's hook discipline.
`sync_state.json` is a committed output (not gitignored), so it travels with
the PR for CI to read.

Tests: sync-stamp fingerprint + the synced/drift/no-marker advisory paths +
section rendering. verify_sync green; 113 tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
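The content-based comparison described in this commit can be sketched as below. `content_sha1` and `unsynced_paths` are hypothetical stand-ins for `_common.file_sha1` and `sync_advisory()`, whose real signatures are not shown here; the sketch works on in-memory bytes rather than reading the repository.

```python
import hashlib


def content_sha1(data: bytes) -> str:
    """Fingerprint a file by its content, so commit history does not matter."""
    return hashlib.sha1(data).hexdigest()


def unsynced_paths(changed, stamp):
    """changed: {path: current bytes}; stamp: {path: sha1}, or None if never synced."""
    if stamp is None:
        # No sync was ever recorded: every changed source file is flagged.
        return sorted(changed)
    return sorted(
        path for path, data in changed.items()
        if stamp.get(path) != content_sha1(data)
    )
```

Because only content is compared, a rebase or squash that rewrites commits but leaves the file bytes unchanged produces no advisory.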
Adversarial review (15 confirmed findings) surfaced one real bug cluster
plus three regressions from the previous restructure. Fixes:

Selection mismatch (the real bug — flagged source files forever):
- sync_advisory now derives its candidate set from source_fingerprint —
  the SAME universe the stamp records — so a tracked source file under a
  SKIP_DIRS dir (vendor/, Pods/, dist/) or a gitignored/.archieignore'd
  path is no longer flagged "unsynced" on every PR with no way to clear it.
- That intersection also drops deletions for free (gone from the universe),
  so a removed file is no longer surfaced as a phantom "re-sync this path".
- The diff now uses `-z`, so non-ASCII paths are no longer C-quoted and then
  dropped (previously a silent false negative: drift in such files was never
  reported).

intent_review regressions:
- Model-call failure on a real blueprint diff now renders an explicit
  "Intent review could not run" notice instead of a clean-looking review.
- The review section always renders when the blueprint changed, so a later
  advisory-only run can't silently erase a prior review's context.
- The sync advisory is computed before the blueprint guards, so it still
  surfaces when the branch blueprint is absent or malformed.

sync-stamp hardening:
- Returns non-zero on failure (was exit 0 — a skipped stamp looked like
  success). Atomic write (temp + os.replace). sort_keys to kill os.walk
  ordering churn in the committed JSON. Warns on an empty fingerprint.

verify_sync: check_hook_scripts_mirror now walks the whole subtree
(rglob + is_file), matching what the npx installer ships — not just *.sh.

Tests: ignored/SKIP_DIRS exclusion, deletion skip, model-failed notice.
verify_sync green; 116 tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
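The sync-stamp hardening (temp file plus `os.replace`, `sort_keys`) follows a common atomic-write pattern, sketched here. `write_json_atomic` is a hypothetical helper, not the body of `cmd_sync_stamp`.

```python
import contextlib
import json
import os
import tempfile


def write_json_atomic(path, payload):
    """Write JSON so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            # sort_keys keeps the committed file stable across walk orderings.
            json.dump(payload, fh, sort_keys=True, indent=2)
            fh.write("\n")
        os.replace(tmp, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp)
        raise
```

Sorting keys is what stops the committed JSON from churning when the directory walk returns files in a different order.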
… cruft filter

A re-review of the prior fixes found the sync_advisory rewrite had
introduced a perf regression and dropped a robustness guarantee.

- HIGH: sync_advisory walked + hashed the WHOLE repo via source_fingerprint
  on every PR (O(repo)), even for a docs-only diff — a real regression vs
  the old O(diff) path, biting large monorepos under fetch-depth:0. Now it
  classifies only the changed paths with a new per-path predicate
  `_common.is_source_path` (the exact per-file form of source_fingerprint's
  SOURCE_EXTENSIONS + SKIP_DIRS + ignore rules) and hashes only those — back
  to O(diff), still byte-consistent with the stamp, still excludes
  ignored/SKIP_DIRS files and skips deletions.
- MEDIUM: restored the "never raises" guarantee — the whole sync_advisory
  body is now guarded (source_fingerprint/IgnoreMatcher could have raised
  and broken the always-exit-0 Action contract).
- MEDIUM: verify_sync's hook-scripts check now ignores OS cruft
  (.DS_Store, __pycache__, *.pyc, *.tmp) so a stray macOS file can't
  false-positive the sync gate.
- LOW: cmd_sync_stamp cleans up its .tmp file if os.replace fails.

Tests: non-ASCII path (-z fix), main() posting the advisory with no branch
blueprint (#7), and check_hook_scripts_mirror subtree-coverage + cruft
filter. verify_sync green; 119 tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
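The subtree mirror check with a cruft filter can be sketched as follows. This is an assumption-laden illustration rather than `check_hook_scripts_mirror` itself: the function names and message strings are hypothetical, and only the cruft patterns come from the commit message.

```python
import fnmatch
from pathlib import Path

CRUFT = (".DS_Store", "__pycache__", "*.pyc", "*.tmp")


def _is_cruft(rel):
    # A path is cruft if any component matches any pattern.
    return any(fnmatch.fnmatchcase(part, pat) for part in rel.parts for pat in CRUFT)


def _files(root):
    root = Path(root)
    return {
        p.relative_to(root): p
        for p in root.rglob("*")
        if p.is_file() and not _is_cruft(p.relative_to(root))
    }


def mirror_mismatches(canonical, mirror):
    """List files missing from, extra in, or differing in the mirror tree."""
    src, dst = _files(canonical), _files(mirror)
    problems = [f"missing in mirror: {rel.as_posix()}" for rel in sorted(src.keys() - dst.keys())]
    problems += [f"extra in mirror: {rel.as_posix()}" for rel in sorted(dst.keys() - src.keys())]
    problems += [
        f"content differs: {rel.as_posix()}"
        for rel in sorted(src.keys() & dst.keys())
        if src[rel].read_bytes() != dst[rel].read_bytes()
    ]
    return problems
```

Walking with `rglob` and `is_file` covers every shipped file, not just `*.sh`, while the filter keeps a stray `.DS_Store` from failing the check.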
@vercel

vercel Bot commented Jun 25, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| archie | Ready | Preview, Comment | Jun 25, 2026 10:56am |
| archie-viewer | Ready | Preview, Comment | Jun 25, 2026 10:56am |

@gbrbks gbrbks merged commit dcb5d3c into main Jun 25, 2026
4 checks passed
@gbrbks gbrbks deleted the feature/architecture-integrity-score branch June 25, 2026 11:04