docs: Arlo comparison audit transparency report and guide by nealmcb · Pull Request #2350 · votingworks/arlo

nealmcb · 2026-06-20T20:14:26Z

Adds three planning documents for Arlo comparison audit transparency.

`docs/transparency-report.md`

Analysis of what Arlo currently exports at each phase of a comparison audit, where the transparency gaps are, and prioritized recommendations for closing them. Covers pre-seed, post-seed, and post-audit phases; identifies gaps in machine-readable formats, opportunistic contest risk levels, pre-seed commitment workflow, and per-jurisdiction phase exports.

`docs/transparency-implementation-plan.md`

Phased implementation plan grounded in two principles:

Software independence — the ability to detect voting system errors without relying on the same software stack that produced them. Requires both mechanical verification (anyone can replicate the sample draw and risk calculation from published artifacts) and human verification (physically present observers independently record board interpretations and compare against Arlo's record).

The blind audit principle — audit boards must interpret each ballot without seeing the CVR. Observers follow along silently using a pre-generated excerpt that joins the retrieval list with the CVR; they never show it to the board or speak during the session.
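The mechanical-verification half of software independence rests on the sample draw being a deterministic function of the published seed and ballot manifests. The sketch below illustrates that idea with a simplified SHA-256 "ticket" ordering. It is not Arlo's actual sampler, and the function names, the ballot ID format, and the draw without replacement are all assumptions made for illustration.

```python
from __future__ import annotations

import hashlib


def ticket(seed: str, ballot_id: str) -> int:
    """Map (seed, ballot_id) to a reproducible pseudo-random integer."""
    digest = hashlib.sha256(f"{seed},{ballot_id}".encode("utf-8")).hexdigest()
    return int(digest, 16)


def manifest_to_ballot_ids(manifest: dict[str, int]) -> list[str]:
    """Expand {batch_name: ballot_count} into IDs such as 'Batch 1-3' (hypothetical format)."""
    return [
        f"{batch}-{position}"
        for batch, count in manifest.items()
        for position in range(1, count + 1)
    ]


def draw_sample(seed: str, ballot_ids: list[str], sample_size: int) -> list[str]:
    """Return the sample_size ballots with the smallest tickets (no replacement)."""
    if sample_size > len(ballot_ids):
        raise ValueError("sample_size exceeds the number of ballots")
    # Sorting by ticket makes the result independent of the input order.
    return sorted(ballot_ids, key=lambda b: ticket(seed, b))[:sample_size]
```

Because the ordering depends only on the seed and the ballot IDs, two observers who start from the same published manifest and seed should obtain the same list, whatever order they load the batches in.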

Track A — Observer Toolkit (no Arlo changes required)

A1 Pytest integration test suite: full 2-round audit, saving phase artifacts with SHA-256 hashes at each transition
A2 Official export functions: download and hash phase artifacts for public posting (public posting is required — observers have no Arlo instance access)
A3 Independent observer verification scripts (replicate_sample.py, replicate_risk_level.py, end_to_end_verify.py): replicate the sample draw and risk level from published artifacts alone
A4 Test coverage for A3/A5 scripts
A5 Observer excerpt generator: joins retrieval list with CVR to produce a print-ready per-ballot sheet; includes a printed notice reminding observers not to show it to the board
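As a rough illustration of the A5 join, the sketch below merges a retrieval list with a publicly posted CVR on a shared ballot identifier and prepends the observer notice. The column names (`Imprinted ID`, `CVR status`), the notice wording, and the plain-text rendering are assumptions; the planned script's actual inputs and print format may differ.

```python
from __future__ import annotations

import csv
import io

NOTICE = "OBSERVER COPY: do not show this sheet to the audit board."


def build_excerpt(retrieval_csv: str, cvr_csv: str, key: str = "Imprinted ID") -> list[dict]:
    """Join each retrieval-list row with its CVR row, keeping retrieval order."""
    cvr_by_id = {row[key]: row for row in csv.DictReader(io.StringIO(cvr_csv))}
    excerpt = []
    for row in csv.DictReader(io.StringIO(retrieval_csv)):
        merged = dict(row)
        cvr_row = cvr_by_id.get(row[key])
        if cvr_row is None:
            # For example, the ballot's record was redacted from the public CVR.
            merged["CVR status"] = "MISSING FROM PUBLIC CVR"
        else:
            merged["CVR status"] = "ok"
            merged.update({k: v for k, v in cvr_row.items() if k != key})
        excerpt.append(merged)
    return excerpt


def render(excerpt: list[dict]) -> str:
    """Render a numbered, one-line-per-ballot text sheet headed by the notice."""
    lines = [NOTICE]
    for number, row in enumerate(excerpt, start=1):
        fields = ", ".join(f"{name}: {value}" for name, value in row.items())
        lines.append(f"{number}. {fields}")
    return "\n".join(lines)
```

Keeping the rows in retrieval-list order lets an observer follow the board ballot by ballot without searching the sheet.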

Track B — Arlo Improvements

B1 Machine-readable audit report (JSON endpoint) — may be worth implementing before some Track A work
B2 Opportunistic contest risk levels: compute and export risk for contests that received ballots incidentally; add universe_ballot_count per contest to sample sizes response
B3 Per-contest, per-jurisdiction eligible ballot count endpoint: needed to verify opportunistic risk level denominators — the manifest CSVs give total ballot counts but not the count of ballots containing a given contest, which comes from CVR metadata
B4 Pre-seed hash-index JSON endpoint
B5 Per-jurisdiction phase exports (new data only per phase)
B6 UI transparency checklist panel at each phase transition (soft gate, not a hard block)
B8 Trusted timestamping support (approach TBD)
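To make the hash-index idea in B4 (and the per-phase hashing in A1/A2) concrete, here is a minimal sketch that hashes every file in an artifact directory and writes the result as JSON. The function names and the JSON layout (relative path mapped to SHA-256 hex digest) are hypothetical; the plan does not specify the endpoint's schema.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large CVR exports need not fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def build_hash_index(artifact_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to artifact_dir) to its SHA-256 digest."""
    files = sorted(p for p in artifact_dir.rglob("*") if p.is_file())
    return {p.relative_to(artifact_dir).as_posix(): sha256_of_file(p) for p in files}


def write_hash_index(artifact_dir: Path, out_path: Path) -> str:
    """Write the index as JSON and return the digest of the index file itself."""
    index = build_hash_index(artifact_dir)
    out_path.write_text(json.dumps(index, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return sha256_of_file(out_path)
```

In this sketch `out_path` should sit outside `artifact_dir`, otherwise a second run would hash the previous index as well. Publishing the single digest returned by `write_hash_index` before the seed is chosen would commit to every listed artifact at once.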

Also notes CVR anonymization requirements (rare ballot styles < ~10 must be aggregated before public CVR release) and why Arlo server library code reuse is appropriate for observer scripts.
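The rare-style rule can be sketched as a simple count-and-relabel pass. This is only an illustration of the threshold check, not the anonymization procedure the plan has in mind; the threshold of 10, the `AGGREGATED` label, and the input shape (ballot ID mapped to ballot style) are assumptions.

```python
from __future__ import annotations

from collections import Counter


def rare_styles(style_by_ballot: dict[str, str], threshold: int = 10) -> set[str]:
    """Return the ballot styles that appear on fewer than `threshold` ballots."""
    counts = Counter(style_by_ballot.values())
    return {style for style, count in counts.items() if count < threshold}


def aggregate_rare_styles(
    style_by_ballot: dict[str, str],
    threshold: int = 10,
    label: str = "AGGREGATED",
) -> dict[str, str]:
    """Relabel every ballot of a rare style so the style is not individually exposed."""
    rare = rare_styles(style_by_ballot, threshold)
    return {
        ballot: (label if style in rare else style)
        for ballot, style in style_by_ballot.items()
    }
```

A real release process would also need to check that the aggregated group itself reaches the threshold, and would have to aggregate the vote data for those ballots rather than only relabeling the style.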

`docs/cloud-testing-deployment-plan.md`

Notes on cloud testing and deployment context for running Arlo in a test environment.

Originally drafted with Copilot. Developed with Claude Code.

Adds docs/transparency-report.md — an analysis of what Arlo currently exports at each phase of a comparison audit, where the transparency gaps are, and prioritized recommendations for closing them. Covers three audit phases (pre-seed, post-seed/pre-comparisons, post-audit), identifies 12 prioritized recommendations, and includes a summary table mapping each transparency need to current Arlo status and the gap. Co-Authored-By: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-Authored-By: Neal McBurnett <nealmcb@gmail.com>

Survey of Heroku (fully supported), VPS, and Docker (absent) deployment options; Cypress E2E and Artillery load-testing tooling; fastest path to a test instance using FLASK_ENV=development + nOAuth. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Two-track plan: Track A (observer toolkit — pytest harness, official export scripts, observer mechanical-verification scripts) and Track B (Arlo improvements — JSON report, opportunistic contest risk levels, sampler-inputs artifact, pre-seed hash bundle, per-jurisdiction phase exports, UI transparency checklist, reproducibility bundle). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add the software-independence framing: computers verifying computers is not sufficient; human observers physically present during audit board sessions are the critical missing link.

Add A5 (generate_transcript.py): joins retrieval list with CVR to produce a right-justified per-ballot transcript (matching the rightJustifiedBallotList.pdf format) that observers follow silently during sessions, marking any deviation from what the board says aloud. No writing required, just listening and marking. Post-session comparison against the Arlo audit report can be done entirely on paper.

Add blind-audit protocol: audit boards never see the CVR at all; observer transcript must likewise never be shown to boards.

Add CVR anonymization context referencing loriinboulder/anonymize_cvr and Branscomb et al. 2018.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Tighten software-independence framing; add Arlo itself to the list of systems that must not leak CVR interpretations to audit boards.

Remove the paragraph stating the excerpt generator uses the unredacted CVR. This is resolved by Q5: excerpt generation must use the publicly-posted (anonymized) CVR, not raw data from the Arlo server.

Revise open questions:

- Q1: redaction timing and overlap with selected ballots need design
- Q3: unaudited contests should prompt "address via other auditing"
- Q4 (was "auth"): reframed as data access; observers have no Arlo instance access, so the export flow must include public posting design
- Q5: answered; excerpt generator is an observer-side tool on public CVR
- Q6: closed; code audit confirmed the audit board UI does not expose CVR vote choices to boards

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

A5 (generate_excerpt.py): specify the --cvr argument must be the publicly-posted (anonymized) CVR, not a raw Arlo export. Add rare-style redaction context to the missing-imprinted-ID warning (step 6). A2 (export scripts): make explicit that public posting of each phase bundle is a required workflow step, not optional — observers have no Arlo instance access and depend on the public artifacts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- B3 reframed: sampler inputs artifact should include per-jurisdiction eligible ballot counts per contest (needed to verify opportunistic risk levels), not just a JSON repackaging of manifest data
- B7 removed: step-by-step observation is preferred over a single reproducibility bundle artifact
- B8 simplified: RFC 3161 specifics removed, approach left as TBD
- p-value → risk level throughout
- deviation → discrepancy throughout
- Observer Toolkit scripts: Arlo server library reuse is fine
- A2 renamed to Official Export Functions (not just scripts)
- Blind-audit principle description softened slightly
- Excerpt format updated (underscore separator, longer decorators)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

nealmcb mentioned this pull request Jun 20, 2026

Document and support best practices for running robust transparent reproducible audits #2351

Open

nealmcb and others added 6 commits June 20, 2026 17:00

nealmcb wants to merge 7 commits into votingworks:main from gwexploratoryaudits:docs/comparison-audit-transparency-report
