A Claude Code skill that performs rigorous adversarial verification across code, architecture, data, documentation, tests, and analysis using Chain-of-Verification (CoV) enhanced with abstractive red-teaming, hidden behavior probing, stress techniques, and tri-modal reasoning.
When invoked, this skill launches a skeptical verifier agent that follows a structured protocol:
Pre-verification (Steps 0–0b):
- Identify what needs verification — code, architecture, data, documentation, tests, or analysis
- Gather artifacts — the actual outputs to verify
- Establish ground truth — what to verify against
Chain-of-Verification (Steps 1–2b):
- Decompose artifacts into individual verifiable claims
- Classify reasoning mode — deductive, inductive, or abductive per claim
- Generate adversarial questions for each claim ("what would make this fail?")
- Abstract to failure categories — find patterns, not just individual bugs
Deep Verification (Steps 3–3d):
- Independently verify each claim by tracing actual paths
- Probe for hidden behaviors — detect what the code doesn't advertise
- Apply adversarial scaffold — suspicion modeling, attack selection, subtlety detection
- Stress test — Existence Question, Scale Shift, Time Travel, Requirement Inversion
Reporting (Steps 4–5):
- Report findings with reasoning-aware confidence scoring and anti-fabrication discipline
- Survived verdicts — stress tests that hold are as valuable as those that break
- Hypotheses — abductive findings reported separately, with alternatives and tests
- Propose project doc updates — TODO.md, SPEC.md, PLAN.md (with user confirmation)
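The protocol steps above can be sketched as a minimal verification loop. This is illustrative Python only: the `Claim` structure, the keyword-based `classify`, and the question templates are assumptions for the sketch, not the skill's internal format.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    mode: str                      # "deductive" | "inductive" | "abductive"
    questions: list = field(default_factory=list)

def classify(sentence):
    # Step 1b stand-in: the real classification is judgment-based, not keyword-based.
    if "always" in sentence or "never" in sentence:
        return "deductive"
    if "usually" in sentence or "often" in sentence:
        return "inductive"
    return "abductive"

def decompose(artifact):
    # Step 1: split an artifact into individually verifiable claims.
    return [Claim(text=s.strip(), mode=classify(s))
            for s in artifact.split(".") if s.strip()]

def adversarial_questions(claim):
    # Step 2: ask "what would make this fail?" for each claim.
    return [f"What input would falsify: {claim.text!r}?",
            f"What does {claim.text!r} NOT cover?"]

claims = decompose("The cache is always invalidated on write. Lookups are usually O(1)")
for c in claims:
    c.questions = adversarial_questions(c)
```

The sketch only shows how decomposition, mode classification, and adversarial questioning chain together; the deep-verification and reporting steps operate on the resulting claims.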
| Domain | What it verifies | Ground truth |
|---|---|---|
| Code | Source changes, logic, behavior | Tests, type system, spec |
| Architecture | Design decisions, spec coverage | Requirements, constraints, patterns |
| Data | Schemas, migrations, contracts | Production schema, validation rules |
| Documentation | Technical, process, and user-facing docs | Actual codebase, current API, git history |
| Tests | Test suite integrity and honesty | Production code, requirements, coverage reports |
| Analysis | Agent outputs, reports, docs | Source material, cited references |
Failure Category Abstraction
Instead of finding individual bugs, this technique identifies failure categories: general patterns that produce bugs repeatedly. It then searches the entire codebase for instances of the same pattern (frequency assumptions, implicit ordering, stale state, missing completeness, silent fallthrough, assumed environment).
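A category-level sweep can be approximated in a few lines. This is an illustrative sketch, not the skill's tooling: the bare-`except:` pattern stands in for the "silent fallthrough" category.

```python
import re
import tempfile
from pathlib import Path

# One failure category expressed as a reusable pattern. Bare "except:" as
# "silent fallthrough" is just an example category, not the skill's actual list.
PATTERN = re.compile(r"^\s*except\s*:", re.MULTILINE)

def sweep(root):
    """Search an entire tree for instances of the same failure category."""
    hits = []
    for path in Path(root).rglob("*.py"):
        text = path.read_text()
        for m in PATTERN.finditer(text):
            hits.append((str(path), text[:m.start()].count("\n") + 1))
    return hits

# Demo on a throwaway tree.
root = tempfile.mkdtemp()
Path(root, "loader.py").write_text("try:\n    load()\nexcept:\n    pass\n")
hits = sweep(root)  # one hit, at line 3 of loader.py
```

The point is the shape of the search: once a pattern is abstracted from a single bug, every other instance in the codebase becomes a candidate finding.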
Hidden Behavior Probing
Detects behaviors the code doesn't advertise using four probing strategies: indirect probing (trace actual execution), scaffolded probing (chain findings), cross-reference probing (claims vs reality), and absence probing (what's NOT there).
Adversarial Scaffold
Decomposes the adversarial process into five modules: suspicion modeling (what would a reviewer miss?), attack selection (highest-risk claims first), plan synthesis (multi-step trace chains), execution (actually read the code), and subtlety detection (code that hides complexity).
Stress Testing
Inspired by the Principles of Chaos Engineering, adapted for review. Four techniques with forced variety (minimum 3 per run, never repeat): Existence Question (should this exist at all?), Scale Shift (what happens at 10x? at zero?), Time Travel (what breaks in 6 months?), Requirement Inversion (what if the opposite were required?). Produces Survived: yes/no verdicts; knowing what's robust is as valuable as knowing what's fragile.
Tri-Modal Reasoning
Each claim is classified by reasoning mode: deductive (verify against ground truth), inductive (generalize from 3+ instances), or abductive (generate the best explanation from observations). Abductive findings are reported as hypotheses with alternative explanations and proposed tests, never as verified facts.
Anti-Fabrication Discipline
Before claiming something doesn't exist, the verifier must state where it looked. Confidence scoring is tied to reasoning mode: deductive (80–100, source cited), inductive (60–79, 3+ instances), abductive (40–59, hypothesis with alternatives). Hard constraint: no score above 79 without citing a specific file/line/doc.
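The bands and the hard constraint can be expressed as a small guard. This is a hypothetical sketch; `score` and its arguments are made-up names, not the skill's API.

```python
# Confidence bands per reasoning mode, as described above.
BANDS = {"deductive": (80, 100), "inductive": (60, 79), "abductive": (40, 59)}

def score(mode, proposed, citation=None):
    # Clamp the proposed score into the band for the claim's reasoning mode.
    lo, hi = BANDS[mode]
    s = max(lo, min(hi, proposed))
    # Hard constraint: nothing above 79 without a specific file/line/doc citation.
    if s > 79 and citation is None:
        s = 79
    return s
```

For example, a deductive claim citing `src/auth.py:42` can keep a score of 95, but the same claim without a citation is capped at 79.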
Agent Bias Detection
When reviewing output from another AI agent, the skill checks for: sycophantic deference, hidden agenda, anchoring bias, confabulated confidence, premature convergence, and evidence cherry-picking.
Copy the skill directory:
```shell
cp -r skills/adversarial-verify ~/.claude/skills/
```

Or clone and copy:
```shell
git clone https://github.com/fullo/claude-adversarial-skill.git
cp -r claude-adversarial-skill/skills/adversarial-verify ~/.claude/skills/
```

Or install from the marketplace (recommended):
```shell
# Add the marketplace (once)
claude plugin marketplace add fullo/claude-plugins-marketplace

# Install the plugin
claude plugin install adversarial-verify@fullo-plugins
```

To update:

```shell
claude plugin update adversarial-verify@fullo-plugins
```

The plugin system uses git commit hashes as versions. There is no automatic update notification: run the update command periodically to stay current.
In Claude Code, type:
`/adversarial-verify`
Or ask naturally:
"run an adversarial review on my recent changes"
"CoV check the last commit"
"verify this code with total skepticism"
"verify the PLAN.md against the SPEC.md"
"adversarial check on this migration"
"verify this agent's analysis report"
"look for systemic failure patterns in the codebase"
"probe this function for hidden behaviors"
"verify the README matches the actual install process"
"verify the tests actually test what they claim"
"stress test the auth module"
"what happens at 10x scale?"
"check if the planning agent's output is biased"
Code
- Silent data corruption — values that look correct but aren't
- Logic flaws — code that passes simple tests but fails edge cases
- Initialization order bugs — field A used before field B is set
- Concurrent modification — adding to a list while iterating it
- State leaks — data persisting across frames/calls when it shouldn't
- Boundary conditions — off-by-one, coordinate system errors
- Resource exhaustion — unbounded lists, missing cleanup
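Several of these are easy to reproduce in miniature. As a hypothetical example, concurrent modification: in Python, removing from a list while iterating over it silently skips elements rather than raising.

```python
# Classic concurrent-modification bug: remove() shifts elements left, so the
# iterator skips the item that slides into the removed slot.
bad = [2, 2, 2]
for x in bad:
    if x % 2 == 0:
        bad.remove(x)
# One even value silently survives: bad == [2]

# Safe version: rebuild the list instead of mutating mid-iteration.
good = [x for x in [2, 2, 2] if x % 2 != 0]
# good == []
```

Exactly the kind of "passes a simple test, fails on the right input" behavior the verifier traces for.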
Architecture
- Spec drift — implementation diverges from SPEC.md
- Missing constraints — PLAN.md doesn't address known edge cases
- Over-engineering — abstraction without justification
- Dependency risk — new deps without evaluation
- Breaking changes — API contract violations
Data
- Schema inconsistency — migration doesn't match model
- Data loss risk — destructive migration without backup
- Constraint gaps — missing NOT NULL, FK, uniqueness
- Backward compat — old code reading new schema
Documentation
- Stale instructions — install/setup steps that no longer work
- API drift — documented endpoints don't match implementation
- Missing docs — new features with no documentation
- Broken examples — code samples that don't compile or run
- Misleading error messages — error text doesn't match error condition
- Version mismatch — docs reference old versions or deprecated features
- Orphaned references — links to removed files or dead URLs
- UI copy drift — help text diverges from actual behavior
Tests
- Tautological tests — assertions that are always true regardless of code
- Mock leakage — tests verify the mock, not the actual behavior
- Coverage lies — line-covered but branch-untested code
- Missing negative tests — only happy path tested
- Fragile assertions — pass by coincidence (order, timing, locale)
- Test-code drift — tests written for a previous code version
- Flaky indicators — `sleep()`, `retry`, `@Ignore`/skip
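The tautological-test anti-pattern is concrete enough to show in a few lines. Illustrative Python; `discount` is a made-up function under test.

```python
# Hypothetical function under test.
def discount(price, pct):
    return price * (1 - pct / 100)

def test_tautological():
    result = discount(100, 10)
    assert result == result  # Always true: verifies nothing about discount().

def test_real():
    assert discount(100, 10) == 90.0  # Fails if the formula regresses.

test_tautological()
test_real()
```

The first test is line-covered and green no matter what `discount` does; only the second pins actual behavior.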
Analysis
- Hallucinated facts — claims without traceable source
- Stale references — citing removed/renamed code
- Logical leaps — conclusion doesn't follow from evidence
- One-sided evidence — only supporting data, contradicting findings omitted
Agent Bias
- Sycophantic deference — agrees without challenging assumptions
- Hidden agenda — favors one approach without justification
- Anchoring bias — first evidence disproportionately shapes conclusions
- Confabulated confidence — high confidence on weak evidence
- Premature convergence — jumps to one hypothesis
- Evidence cherry-picking — selects only supporting evidence
Optionally integrates with a multi-agent trust scoring system:
- Each confirmed bug: -1 trust to the agent that wrote it
- Each false positive: -1 trust to the verifier
- Every 3 clean reviews: +1 trust to the developer
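The scoring rules can be sketched as a small updater for the trust state. This is illustrative only: the skill does not ship this script, the `verifier` entry name is an assumption, and persistence to the JSON file (e.g. via `json.dump`) is omitted.

```python
def apply_review(trust, agent, confirmed_bugs=0, false_positives=0, clean=False):
    agents = trust["agents"]
    # Each confirmed bug: -1 trust to the agent that wrote it.
    agents[agent]["trust"] -= confirmed_bugs
    # Each false positive: -1 trust to the verifier.
    agents.setdefault("verifier", {"trust": 7, "clean_commits": 0})
    agents["verifier"]["trust"] -= false_positives
    # Every 3 clean reviews: +1 trust to the developer.
    if clean:
        agents[agent]["clean_commits"] += 1
        if agents[agent]["clean_commits"] % 3 == 0:
            agents[agent]["trust"] += 1
    return trust

state = {"agents": {"dev": {"trust": 7, "clean_commits": 0}}}
apply_review(state, "dev", confirmed_bugs=2)   # dev: 7 -> 5
for _ in range(3):
    apply_review(state, "dev", clean=True)     # third clean review: 5 -> 6
```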
Track trust in .claude/agent-trust.json:
```json
{
  "agents": {
    "dev": { "trust": 7, "clean_commits": 0 },
    "test": { "trust": 7, "clean_commits": 0 }
  }
}
```

Follows the Agent Skills format and works with Claude Code, Cursor, Windsurf, Cline, and other compatible agents.
- Chain-of-Verification (CoV) — Dhuliawala et al., 2023
- Automated Auditing — Anthropic, 2025
- Abstractive Red-Teaming — Anthropic, 2026
- AuditBench — Anthropic, 2026
- Strengthening Red Teams — Anthropic, 2025
- Principles of Chaos Engineering
Extracted from the Rainbow Climb game development project, where it was used to catch critical bugs including:
- Timer-based continuous fire (`shootTimer` never reset)
- Patrol boundary flip-flop (velocity inverted every frame)
- Missing collision bounds (collectibles had 0x0 rectangles)
- Shield absorption blocking subsequent projectile checks
MIT