Trustabl is a static analyzer for agent reliability. It parses an agent-SDK
repository (Claude Agent SDK, OpenAI Agents SDK, Google ADK, MCP, LangChain /
LangGraph, CrewAI, AutoGen / AG2, Pydantic AI, and the Vercel AI SDK), models the
tools, agents, subagents, skills, slash commands, and plugin manifests it
declares, and checks them against a catalog of reliability and safety rules. It reports the weaknesses it finds — each
with an explanation, a suggested fix, and a confidence score — as a
human-readable summary, JSON, or SARIF 2.1.0, plus a per-surface reliability
score and a CI-friendly exit code. It ships as a single Go binary with no
hosted service: it runs as a CLI, or as a local stdio MCP server
(trustabl mcp) that exposes the same scan to MCP clients without opening a
network port.
The rest of this document explains what Trustabl reasons about and how the scan works, then covers building and running it. For the full implementation reference see ARCHITECTURE.md; for the at-a-glance SDK coverage matrix see COVERAGE.md.
Trustabl does not treat a repository as one undifferentiated blob. Every rule is classified into exactly one of five scopes, and each scope receives a different typed input:
- tool: fires once per tool definition. Input: a ToolDef (a @function_tool / @tool / @claude_tool function, a Claude TS tool(name, description, schema, handler) factory call, a FunctionTool(fn) ADK wrapper, an @server.tool MCP registration, or a bare shell-invoking function) plus its parsed file. Catches a missing docstring, an HTTP call with no timeout, untyped parameters, or an unnormalized path flowing into open(). (Hosted tools like WebSearchTool() are agent-scope edge data, captured as HostedToolDef, not ToolDef.)
- agent: fires once per agent declaration. Input: an AgentDef (a Python Agent(...) / SandboxAgent(...) / AgentDefinition(...) call, a Claude TS typed-const AgentDefinition, a Claude TS sub-agent inline in options.agents, or the Claude TS query(...) main-thread agent, QueryMainAgent) with every constructor kwarg captured and its edges to tools, handoffs, and guardrails resolved. Catches an agent with shell tools and no input_guardrails, tool_use_behavior="stop_on_first_tool" paired with filesystem-touching tools, or a main-thread agent with unrestricted allowedTools.
- subagent: fires once per Claude Code subagent markdown declaration. Discovery is hybrid: canonical .claude/agents/*.md (any path depth, monorepo-safe) plus a frontmatter-shape fallback over all markdown files (gated on name + tools/model) that catches flat-collection repos which ship subagents under categories/*.md, plugins/<x>/agents/*.md, or similar layouts. Input: a SubagentDef parsed from frontmatter: name, description, tools[] (verbatim) + ToolGrants[] (parsed permission grammar), disallowedTools, model, permissionMode (incl. bypassPermissions), mcpServers, skills, isolation, hasHooks. Catches a subagent granted the built-in Bash tool despite a read-only description (CSDK-110). Subagent presence alone contributes claude_agent_sdk to SDKsDetected, so the Claude pack loads and CSDK-110 fires on pure-markdown subagent collections.
- skill: fires once per Claude Code skill (SKILL.md, any path depth). Input: a SkillDef parsed from frontmatter (name, description, allowed-tools → ToolGrants[], disable-model-invocation) plus body facts (dynamic-context exec commands, external URLs, prompt-injection markers) and a bundled-file inventory. Catches a skill that auto-approves unrestricted Bash (CSKILL-001), runs a dynamic-context command that performs network egress or reads secrets before the model sees it (CSKILL-003), or is model-invocable while granting side-effecting tools (CSKILL-050). Skills are markdown, so skill rules carry no language: field; the claude_skill pack loads whenever a SKILL.md is present.
- repo: fires once per scan against the whole inventory. Catches project-wide gaps such as the OpenAI Agents SDK being present with no custom trace processor configured.
A repo can declare zero, one, or many agents, across one or more SDKs. Two agents in the same repo can be in completely different security postures — one wired with input/output guardrails, the other not. Agent-scoped findings therefore attribute to a specific agent at its constructor call site; flattening them to a single repo-level verdict would lose that attribution and be wrong. Discovery builds a small per-repo graph (tools, agents, subagents, and the edges between them) so agent-scope and subagent-scope rules can query it.
A Claude-SDK rule and an OpenAI-Agents-SDK rule that detect the same
conceptual problem (a missing timeout, say) are two separate rules with
SDK-specific explanation and fix text — there is no cross-SDK casting.
When a repo declares agents from multiple SDKs side by side, each agent is
checked only against the rules for the SDK that declared it. The same
holds across languages: a language: python rule will not fire on a
TypeScript agent.
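As a rough illustration of that gating, the sketch below filters agent-scope rules by the SDK and language of the agent that declared them. The `Rule` and `AgentDef` shapes, the `openai_agents` identifier, and the `EX-PY-1` rule ID are hypothetical stand-ins for this example, not Trustabl's actual types or rule catalog.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    rule_id: str
    sdk: str        # SDK the rule was written for
    language: str   # value of the rule's language: field
    scope: str      # tool / agent / subagent / skill / repo


@dataclass(frozen=True)
class AgentDef:
    name: str
    sdk: str
    language: str


def rules_for_agent(agent: AgentDef, rules: List[Rule]) -> List[Rule]:
    """Keep only agent-scope rules whose SDK and language both match the agent."""
    return [
        r for r in rules
        if r.scope == "agent" and r.sdk == agent.sdk and r.language == agent.language
    ]


rules = [
    Rule("CSDK-120", "claude_agent_sdk", "typescript", "agent"),
    Rule("OAI-105", "openai_agents", "typescript", "agent"),
    Rule("EX-PY-1", "openai_agents", "python", "agent"),  # hypothetical rule ID
]

ts_openai_agent = AgentDef("triage", "openai_agents", "typescript")
selected = [r.rule_id for r in rules_for_agent(ts_openai_agent, rules)]
```

Here only the TypeScript OpenAI rule is selected: the Claude rule is excluded by SDK and the Python rule by language.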
trustabl scans in four steps. Each step's output is the typed input to the next, with no shared state between runs — and the inventory the early steps build is what makes policy selection data-driven rather than statically configured.
The binary ships with no embedded rules. Before the pipeline runs,
Trustabl resolves its detection rules from a separate git repository
(trustabl-rules) —
fetching the latest, caching the clone locally, and falling back to the
cache when the network is unreachable. This decouples rule updates from
binary releases: rules can be added or changed without rebuilding the
scanner. The resolved rules commit is recorded in the result and folded
into the ScanID, so a scan is honest about which rules produced it.
If no rules can be fetched and none are cached, the scan exits 2 and
tells you to run trustabl rules pull — Trustabl never runs rule-less.
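The fallback order described above (fetch, then cache, then refuse) can be sketched as follows. This is a minimal illustration under assumptions: `resolve_rules`, `NoRulesError`, and the idea that a failed fetch surfaces as `OSError` are all invented for the example and say nothing about how the Go implementation is structured.

```python
from typing import Callable, Optional, Tuple


class NoRulesError(Exception):
    """Raised when no rules can be fetched and none are cached (maps to exit 2)."""
    exit_code = 2


def resolve_rules(
    fetch_latest: Callable[[], str],
    cached_commit: Optional[str],
    allow_update: bool = True,
) -> Tuple[str, str]:
    """Return (rules_commit, source), where source is 'fetched' or 'cache'."""
    if allow_update:
        try:
            return fetch_latest(), "fetched"
        except OSError:
            pass  # network unreachable: fall through to the cache
    if cached_commit is not None:
        return cached_commit, "cache"
    raise NoRulesError("no rules available; run `trustabl rules pull`")


def offline_fetch() -> str:
    # Stand-in for a fetch that fails because the network is down.
    raise ConnectionError("network unreachable")
```

With a cached commit, `resolve_rules(offline_fetch, "abc123")` falls back to the cache; with no cache it raises, which is the case where the scan refuses to run.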
flowchart LR
target[("Agent repo<br/>(local path or GitHub URL)")]
recon["Recon<br/>files · SDK deps"]
inv["Inventory<br/>Python + TS AST:<br/>tools · agents ·<br/>subagents · MCP servers"]
pol["Policy selection<br/>load rules per<br/>detected SDK ·<br/>META findings"]
ana["Analysis<br/>tool · agent · subagent ·<br/>repo detectors"]
score["Scoring<br/>per-surface score ·<br/>overall readiness"]
out[("ScanResult<br/>findings · scores<br/>(human / JSON / SARIF)")]
target --> recon --> inv --> pol --> ana --> score --> out
- Recon: walk the repo and answer "what's in here" cheaply, without parsing any source language. It records the languages present (by extension), SDK dependencies declared in manifests (pyproject.toml / requirements.txt / Pipfile / poetry.lock / package.json for the claude-agent-sdk / @anthropic-ai/claude-agent-sdk / openai-agents / @openai/agents / google-adk / @google/adk needles), the file inventory, and discovered agent components (MCP configs, hook scripts, CLAUDE.md and AGENTS.md guidance docs, .claude/agents/*.md subagents at any depth, SKILL.md skills, slash commands at both .claude/commands/*.md and <plugin-root>/commands/*.md, .claude-plugin/{plugin,marketplace}.json manifests, sandbox policies). No tree-sitter parses happen here; this step decides whether the expensive AST work is even worth attempting.
- Inventory: for each language Recon cleared, do the AST work and extract a typed inventory: ToolDefs with their config and body facts, AgentDefs with all kwargs captured, SubagentDefs / SkillDefs / SlashCommandDefs / PluginManifests parsed from markdown frontmatter and JSON manifests, MCPServerDefs, guardrails, sessions, and the resolved edges between agents and the tools/guardrails they reference. Detectors read fields off these structs; they never re-parse raw source.
- Policy selection: load only the rule packs for SDKs actually observed in code. An SDK seen in code with no shipped pack emits a META-001 info finding ("Trustabl does not currently audit this SDK"), because silence on an unknown SDK is wrong. A dep declared but never used in code emits a different info finding flagging the drift.
- Analysis: run the selected scope-aware detectors against the inventory. Findings carry the scope they fired at and attribute to the right location: tool file/line, agent call site, subagent markdown file, or the manifest.
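The policy-selection step in the list above can be pictured as a few set operations over what Recon and Inventory observed. This is a sketch under assumptions: the function name, the input shapes, and the `dependency-drift` label are hypothetical (the text does not name the drift finding's ID); only META-001 comes from the description.

```python
from typing import Iterable, List, Tuple


def select_policy(
    sdks_in_code: Iterable[str],
    sdks_in_manifests: Iterable[str],
    shipped_packs: Iterable[str],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return (packs to load, info findings as (finding_id, sdk) pairs)."""
    in_code = set(sdks_in_code)
    in_manifests = set(sdks_in_manifests)
    shipped = set(shipped_packs)

    # Load packs only for SDKs actually observed in code.
    packs = sorted(in_code & shipped)

    findings = []
    # SDK used in code but no pack ships for it: say so rather than stay silent.
    for sdk in sorted(in_code - shipped):
        findings.append(("META-001", sdk))
    # Dependency declared in a manifest but never used in code.
    for sdk in sorted(in_manifests - in_code):
        findings.append(("dependency-drift", sdk))  # placeholder label
    return packs, findings


packs, findings = select_policy(
    sdks_in_code={"claude_agent_sdk", "some_new_sdk"},
    sdks_in_manifests={"claude_agent_sdk", "google_adk"},
    shipped_packs={"claude_agent_sdk", "google_adk"},
)
```

Sorting the outputs keeps the result independent of set iteration order, in line with the determinism goal stated for the report.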
Three properties fall out of this staging, by design:
- Performance. A repo with no Python skips Python AST work; a repo with only Claude TS code skips Python AST work AND OpenAI policy loading.
- Honest coverage. An "unaudited SDK" info finding is louder than a zero-findings clean bill of health on an SDK Trustabl doesn't know. A META-004 finding further distinguishes "audited and clean" from "could not audit: discovery extracted nothing a rule targets."
- Determinism is a contract. Same inputs → same ScanID, and the report is byte-stable across runs (findings sorted by (RuleID, FilePath, Line), inventory slices sorted deterministically). CI consumers can diff scans without spurious churn.
See ARCHITECTURE.md § 2 for the full diagram with typed inputs at each step.
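To make the byte-stability property concrete, the sketch below sorts findings by the (RuleID, FilePath, Line) key named above and serializes them canonically, so that discovery order does not affect the output bytes. The dictionary keys and the SHA-256 digest are assumptions for illustration; the digest is not the ScanID, whose exact derivation is not described here.

```python
import hashlib
import json
from typing import Dict, List


def canonical_report(findings: List[Dict]) -> bytes:
    """Serialize findings in a stable order with stable key ordering."""
    ordered = sorted(
        findings,
        key=lambda f: (f["rule_id"], f["file_path"], f["line"]),
    )
    return json.dumps(ordered, sort_keys=True, separators=(",", ":")).encode("utf-8")


def report_digest(findings: List[Dict]) -> str:
    """Digest of the canonical bytes; equal inputs give equal digests."""
    return hashlib.sha256(canonical_report(findings)).hexdigest()


run_a = [
    {"rule_id": "CSDK-110", "file_path": "agents/reviewer.md", "line": 1},
    {"rule_id": "CSDK-010", "file_path": "src/tools.ts", "line": 42},
]
# Same findings, discovered in a different order.
run_b = list(reversed(run_a))
```

Both runs produce identical bytes, which is what lets a CI job diff two reports without spurious churn.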
Tool/agent AST discovery is wired for:
- Python: Claude Agent SDK (decorators), OpenAI Agents SDK, Google ADK, LangChain / LangGraph, CrewAI, AutoGen / AG2, and Pydantic AI. Discovery extracts tool definitions, agent constructors, hosted tools, MCP servers, guardrails, and sessions. The bare Agent(...) constructor shared by OpenAI / ADK / CrewAI / Pydantic AI is import-gated per SDK so the classes never cross-match, and the shared @tool decorator is routed to the owning SDK by its import binding.
- TypeScript: Claude Agent SDK (the tool() factory, the query() main-thread QueryMainAgent, inline-in-query() sub-agents, typed-const AgentDefinitions, createSdkMcpServer and the four options.mcpServers config literals), OpenAI Agents SDK (the tool({...}) factory, new Agent({...}) and Agent.create({...}), 9 hosted-tool factories, MCP server classes across 3 transports plus the MCPServers wrapper, 4 defineX guardrail factories, and the MemorySession / OpenAIConversationsSession / OpenAIResponsesCompactionSession session classes; gated on imports from @openai/agents, @openai/agents-core, or @openai/agents-openai), Google ADK (the new FunctionTool({...}) constructor, 5 agent constructors (new LlmAgent({...}) / SequentialAgent / ParallelAgent / LoopAgent / RoutedAgent), 13 hosted-tool classes, and subAgents edges; gated on imports from @google/adk), LangChain / LangGraph (the tool(fn, {...}) factory, DynamicStructuredTool / DynamicTool, and createReactAgent / createAgent / new AgentExecutor; gated on the @langchain/* / langchain / langgraph ecosystem), and the Vercel AI SDK (the tool({...}) / dynamicTool({...}) single-object factory, the call-based generateText / streamText / generateObject / streamObject agents and the class ToolLoopAgent / Experimental_Agent, with tools walked as an object/record, plus the <provider>.tools.*() hosted tools; gated on the bare ai import). Handles .ts / .tsx / .mts / .cts plus JavaScript .js / .jsx / .mjs / .cjs with the tree-sitter-typescript and tree-sitter-tsx grammars (JavaScript routes to the tsx grammar, a JS superset, and is audited by the same language: typescript rule packs). TypeScript rule packs ship for the Claude Agent SDK (CSDK-010/011/012/013/014/016 tool rules; CSDK-120/130/131 agent rules), OpenAI Agents SDK (OAI-016/017/019/022/024 tool rules; OAI-105 agent rule), Google ADK (ADK-013/015/016 tool rules; ADK-109 agent rule), MCP (MCP-011/012/013/014 tool rules), LangChain (LC-010/011/012/013/014 tool rules; LC-111 agent rule), and the Vercel AI SDK (VAI-001..008 tool/agent rules; VAI-012 repo rule). A TS repo for any of these no longer produces a blanket META-004; see COVERAGE.md for the full matrix.
JavaScript (.js / .jsx / .mjs / .cjs) is AST-parsed through the shared
TypeScript-family pipeline: its tools and agents are discovered, tagged
javascript, and audited by the language: typescript rule packs (both ES
import and CommonJS require() bindings are recognized). Go has
tree-sitter-go discovery for MCP tools (mark3labs/mcp-go and the official
modelcontextprotocol/go-sdk), audited by the language: go rules in the MCP
pack. C# has tree-sitter-c-sharp discovery for the official ModelContextProtocol
SDK's [McpServerTool] methods, audited by the language: csharp rules. PHP has
tree-sitter-php discovery for #[McpTool]-attributed methods (official mcp/sdk
and community php-mcp/server), audited by the language: php rules. Rust has
tree-sitter-rust discovery for the official rmcp crate's #[tool]-attributed
methods (descriptions read from the description = "..." arg or the /// doc
comment), audited by the language: rust rules; other Go, .NET, PHP, and Rust
SDKs are recognized as files by Recon but not yet AST-parsed.
The rule schema's language: field gates per-language rule sets.
- LLM enrichment is a separate post-scan step (trustabl enrich). Rule-based detection (trustabl scan) makes no network call; there is no LLM involved in the scan itself. trustabl enrich reads the scan output and calls Anthropic with BYOK (key stored via trustabl llm key set at ~/.config/trustabl/keys.json, mode 0600). The Anthropic call carries a request timeout, and --apply rewrites a file only when its current contents still match what the model reviewed (writing a .trustabl.bak backup first); a stale scan is skipped, never mis-applied.
- Confidence scores are heuristic, not LLM-judged, and not yet calibrated against a labelled real-agent corpus; treat findings as signal to investigate.
- The CLI is the surface. No web app, API server, or hosted service: pipe --format json or --format sarif into your own automation. On GitHub Actions, trustabl/trustabl-action wraps the scan and uploads SARIF to the Security tab for you; for any other CI, --format sarif --output <file> produces a SARIF 2.1.0 report that feeds github/codeql-action/upload-sarif or any SARIF-aware step.
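The stale-scan guard described for --apply (rewrite only if the file still matches what was reviewed, and back it up first) can be sketched like this. It is an illustration under assumptions: comparing SHA-256 digests and naming the backup `<file>.trustabl.bak` are guesses at one reasonable mechanism, and `apply_fix` is a hypothetical helper, not part of the CLI.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Digest of the file's current bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def apply_fix(path: Path, reviewed_sha256: str, new_text: str) -> bool:
    """Apply a proposed rewrite only if the file is unchanged since review.

    Returns True when the fix was written, False when it was skipped as stale.
    """
    if sha256_of(path) != reviewed_sha256:
        return False  # file changed after the scan: skip rather than mis-apply
    backup = path.with_name(path.name + ".trustabl.bak")  # assumed naming
    shutil.copy2(path, backup)  # back up before touching the original
    path.write_text(new_text, encoding="utf-8")
    return True
```

A second call with the same reviewed digest returns False, because the first rewrite already changed the file's contents.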
Trustabl is a detect-and-report tool: it does not write or modify any
files in the scanned repo. Each run produces a ScanResult containing:
- Findings: one per rule hit, each with severity, confidence, an explanation, a suggested_fix, and the location it fired at (tool file/line, agent call site, subagent file, or the manifest).
- Per-surface readiness scores (one per discovered tool, agent, subagent, or the repo as a whole) and an overall score (a breadth-aware, badness-weighted mean: weak surfaces pull it down harder, but a single poor surface does not zero it; the score is a triage signal, not the CI gate).
- The discovered inventory — tools, agents, hosted tools, MCP servers, subagents, skills, slash commands, plugin manifests, and Claude settings — surfaced at the top level for CI consumers.
The human format honestly separates the three things people commonly conflate:
Tool definitions: 2 (custom tools with function bodies — scored below)
Agent tool grants: 14 (tool names the agent may call — audited by agent-scope rules)
Hosted tools: 1 (...)
Only the "Tool definitions" category flows through tool-scope rules (they have function bodies a rule can read). Agent grants and hosted instances are inputs to agent-scope rules, not unanalyzed — they just don't appear in the per-surface readiness table.
--format human (default) renders a human summary to stdout and live
progress to stderr — an animated spinner and progress bar on an
interactive terminal, or plain [phase] summary lines when piped
(CI-friendly).
--format json marshals the full ScanResult for piping into your
own automation.
--format sarif emits a SARIF 2.1.0 document, suitable for
github/codeql-action/upload-sarif and other SARIF-aware tools. The suggested
fix is carried at the rule level (help.text); Trustabl emits no per-result
fixes[], so the document passes GitHub Code Scanning's schema validator (which
rejects a fix that lacks artifactChanges).
--json-out <file> and --sarif-out <file> write the JSON / SARIF document to a
file independent of --format — one scan can print the human summary to stdout
while persisting both machine artifacts. The file bytes are identical to the
matching --format stdout output.
--bom-out <file> additionally writes a byte-stable CycloneDX 1.5 BOM of the
dependencies the repo declares across every supported language — requirements.txt
/ pyproject.toml / Pipfile (pip), package.json (npm), go.mod (Go),
composer.json (Composer), *.csproj (NuGet), Cargo.toml (Cargo). It is pure
inventory of DECLARED direct deps and makes no network call.
--vuln-scan turns that BOM into a vulnerability verdict: it matches the repo's
concretely-pinned dependencies against a pinned OSV snapshot
and reports each affected package as a finding carrying the advisory ID
(CVE / GHSA / PYSEC / …), a CVSS-derived severity, and the first fixed version —
so a vulnerable dependency fails the scan through the normal severity gate and
exit codes and lands in the JSON / SARIF output alongside the rule findings, on
ScanResult.vulnerabilities. Unlike the rest of a scan it is opt-in and
online: the OSV snapshot is fetched from osv.dev on first use, cached under
your user cache directory, and then cache-first — a later --vuln-scan
reuses the cached database (no re-download) until it is older than 24h, so
repeated scans are fast and offline-capable. trustabl vulndb pull refreshes the
cache on demand; --no-rules-update pins to the cache at any age (fully offline).
Only concretely-pinned versions are matched
— a declared range (^1.0, >=2) can't be resolved to one version without a
lockfile, so it is left unmatched rather than guessed. The snapshot version is
folded into the ScanID only when --vuln-scan is on, so the result is honest
about which vulnerability data produced it while a default scan stays
byte-identical to before.
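The "pinned versions only" rule can be illustrated with a deliberately conservative check: anything that looks like a range or wildcard is treated as unpinned and left unmatched. The regular expression and the `is_concretely_pinned` name are assumptions for this sketch. It also ignores per-ecosystem semantics (for example, Cargo reads a bare `1.2.3` as a caret range), which a real matcher would need to handle per manifest type.

```python
import re

# Optional exact-match operator, optional "v", dotted digits, and an optional
# pre-release or build suffix introduced by "-" or "+". Range operators
# (^, ~, >=, <, *) and wildcards never match.
_EXACT_VERSION = re.compile(r"^(==|=)?\s*v?\d+(\.\d+)*([-+][0-9A-Za-z.-]+)?$")


def is_concretely_pinned(spec: str) -> bool:
    """Return True only when the spec names exactly one version."""
    return bool(_EXACT_VERSION.match(spec.strip()))


examples = ["==1.2.3", "1.2.3-beta.1", "^1.0", ">=2", "1.x", "*"]
pinned = [s for s in examples if is_concretely_pinned(s)]
```

Erring toward "unpinned" matches the stated policy: a spec that cannot be resolved to one version is skipped rather than guessed.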
Combining --vuln-scan with --bom-out upgrades the CycloneDX document from a
plain inventory into a BOM plus VEX: the matched advisories are emitted as a
CycloneDX 1.5 vulnerabilities[] array — each with the advisory ID, an OSV
source, a severity rating, an upgrade recommendation, and an affects[]
reference linking it to the affected component's bom-ref — so a single
trustabl scan ./repo --vuln-scan --bom-out bom.json produces a standards-based
artifact that any CycloneDX-aware tool can ingest. Without --vuln-scan the
vulnerabilities[] array is omitted and the BOM stays pure inventory.
--format json and --format sarif are progress-silent and byte-stable
across identical-input runs (pure functions of the ScanResult). The human
format is not byte-stable by design: its ANSI color is auto-detected from the
terminal (TTY vs pipe, NO_COLOR), so the same scan can render with or without
color. Use --no-color, or diff the JSON/SARIF output, when byte-stability
matters.
--verbose (-v) narrates the scan on stderr: rule provenance (repo, ref,
resolved SHA, and any cache fallback), per-phase discovery counts (languages,
tools, agents, detected SDKs, loaded detectors, unaudited SDKs), output
destinations, and a final result summary (scan ID, score, findings by severity,
exit code). --debug adds everything --verbose shows plus per-phase timing and
capped per-entity / per-finding detail (each discovered tool/agent and each
finding with its file:line).
Both are global flags — they work on scan, mcp, and rules pull, and may
appear before or after the subcommand (trustabl -v scan … or trustabl scan -v …). --debug implies --verbose. Both write only to stderr, so they never
perturb the report on stdout or the JSON/SARIF byte-stability contract:
--format json --debug still emits a clean document on stdout while the
diagnostics stream to stderr. Diagnostic color follows the same rules as the
report (off under --no-color, NO_COLOR, or when stderr is not a terminal).
Because an animated progress panel and interleaved log lines would corrupt each
other on the same stderr, --verbose/--debug automatically render progress as
plain [phase] lines instead of the live spinner.
Saving diagnostics to a file. There is no dedicated --log-file flag —
because diagnostics are a separate stream (stderr), redirecting stderr is the
intended mechanism:
# Report and diagnostics to separate files (stdout vs stderr)
trustabl scan ./repo --debug --format json >report.json 2>diagnostics.log
# Human report on screen, diagnostics to a file
trustabl scan ./repo --debug 2>diagnostics.log
# Everything (report + diagnostics) in one file
trustabl scan ./repo --debug &>everything.log
With --format json/sarif progress is off, so the stderr file is diagnostics-only; with --format human it also carries the plain [phase] progress lines.
Exit codes:
- 0: no findings ≥ medium severity (or no findings at all).
- 1: at least one finding ≥ medium severity, or --strict with any finding present.
- 2: scanner / I/O error, or no usable rules found and none fetchable (run trustabl rules pull), or a signed channel (--channel) that failed verification (bad signature, untrusted/expired key, channel confusion, an expired or rolled-back statement, or a digest mismatch). Trustabl refuses to run unverified rules.
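The exit-code rules above reduce to a small decision function. The sketch below is an illustration only: the severity names other than info and medium (low, high, critical) and their ordering are assumptions, and `exit_code` is a hypothetical helper rather than anything the CLI exposes.

```python
from typing import Iterable

# Assumed severity ladder; only "info" and "medium" are named in the text.
SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def exit_code(
    severities: Iterable[str],
    strict: bool = False,
    operational_error: bool = False,
) -> int:
    """Map a scan outcome to the documented 0 / 1 / 2 exit codes."""
    if operational_error:
        return 2  # scanner or I/O error, no usable rules, failed verification
    found = list(severities)
    if strict and found:
        return 1  # --strict: any finding fails the run
    if any(SEVERITY_RANK[s] >= SEVERITY_RANK["medium"] for s in found):
        return 1
    return 0
```

For example, `exit_code(["low", "info"])` is 0, while the same findings with `strict=True` give 1.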
OpenShell surfaces are still discovered (shell-invocation functions,
openshell/*.yaml policies) and reported in a Risk surfaces: openshell
block in the human format: the count of shell-invoking functions, the first
three file:line locations (deterministically sorted), a why: line stating
the threat model (a prompt-injected agent that exposes one of these as a
callable tool can run arbitrary commands), and a fix: line with concrete
remediations (sandbox, allowlist, drop shell=True, keep shell logic out
of agent-callable code). The OSH-* detection rules that audited these
surfaces have moved to a closed-source companion project; with no OSH rules
shipped, such repos fire no rule and no META finding — the block makes
the unaudited risk legible without claiming an audit happened. OpenShell is
a risk surface, not an SDK, so it is not flagged as "unaudited" the way an
unknown SDK would be.
brew install trustabl/tap/trustabl
scoop bucket add trustabl https://github.com/trustabl/scoop-bucket
scoop install trustabl
docker run --rm -v "$PWD:/repo" ghcr.io/trustabl/trustabl:latest scan /repo
Grab a prebuilt archive for your platform from the Releases page. Each release includes a checksums.txt and a build-provenance attestation; verify with:
gh attestation verify <archive> --repo trustabl/trustabl
Building from source requires CGO_ENABLED=1 because the AST parsers use tree-sitter (Python + TypeScript + TSX bindings), which is a C library:
# macOS / Linux
CGO_ENABLED=1 go build -o trustabl ./cmd/trustabl
# Cross-compile: pick a C toolchain for the target. zig is the easiest.
CGO_ENABLED=1 CC="zig cc -target x86_64-linux-gnu" \
GOOS=linux GOARCH=amd64 go build -o trustabl-linux ./cmd/trustabl
This is the cost of using tree-sitter for accurate AST parsing. If a
single-binary, no-CGO distribution becomes a hard requirement later, the
swap target is github.com/go-python/gpython for Python (with lower
fidelity on modern Python); TypeScript would need a separate replacement.
# Local repo
trustabl scan ./path/to/agent-repo
# GitHub repo (shallow clone to temp dir, removed on exit)
trustabl scan https://github.com/org/repo
# Restrict detectors
trustabl scan ./repo --detectors claude_sdk
trustabl scan ./repo --detectors openai_sdk
trustabl scan ./repo --detectors google_adk
trustabl scan ./repo --detectors claude_sdk,openai_sdk,google_adk
# --detectors openshell is accepted but selects zero rules (pack is closed-source now)
# Agent Skill security (SKILL.md) — flags unrestricted allowed-tools (a bare
# `Bash` grant), pre-model dynamic-context exec, bundled-script network egress /
# secret reads, committed secrets, hidden-Unicode prompt injection, and a
# description that claims read-only while granting side-effecting tools (the
# CSKILL-* rules). Skills are discovered and scanned automatically.
trustabl scan ./repo # scans skills alongside tools/agents/MCP
trustabl scan ./repo --detectors claude_skill # only the Agent Skill (CSKILL-*) rules
trustabl scan ./path/to/my-skill # point straight at one skill's directory
# Dependency BOM (supply chain): export the repo's DECLARED deps across all
# supported languages (pip / npm / Go / Composer / NuGet / Cargo manifests) as a
# CycloneDX SBOM, to hand to OSV-Scanner / Dependabot / syft. Pure inventory —
# the scan itself does no CVE lookup.
trustabl scan ./repo --bom-out sbom.json
# Vulnerability scan (opt-in, online): match the repo's pinned deps against the
# OSV database and FAIL on known CVEs — advisory id, CVSS severity, fixed version.
trustabl vulndb pull # pre-download OSV (optional; --vuln-scan auto-fetches)
trustabl scan ./repo --vuln-scan # BOM inventory + CVE verdict in one pass
trustabl scan ./repo --vuln-scan --bom-out bom.json # CycloneDX BOM + VEX (vulnerabilities[]) in one file
# JSON output for CI piping
trustabl scan ./repo --format json
# SARIF output for GitHub Code Scanning / SARIF-aware tools
trustabl scan ./repo --format sarif > trustabl.sarif
# Write the report to a file instead of stdout (any format). --output writes
# the file even when the scan exits 1 on findings, so a CI step can upload it.
trustabl scan ./repo --format sarif --output trustabl.sarif
# One scan, both machine artifacts written to files (human summary to stdout)
trustabl scan ./repo --json-out trustabl.json --sarif-out trustabl.sarif
# Exit 1 on any finding regardless of severity
trustabl scan ./repo --strict
# Download / refresh the detection rule packs into the local cache
trustabl rules pull
# Validate a local rule-pack directory against this build's schema (CI gate
# for the trustabl-rules repo — strict-loads every pack, fails on the first error)
trustabl rules validate ./trustabl-rules
# Use a custom rules repo, or pin a specific released ruleset (env: TRUSTABL_RULES_REPO).
# Default pulls the latest reviewed rules from trustabl-rules main; pin a tag for stability.
trustabl scan ./repo --rules-repo https://github.com/org/my-rules
trustabl scan ./repo --rules-ref v0.1.0
# Air-gapped / offline: skip the network fetch, use the cached rules only
trustabl scan ./repo --no-rules-update
# Progress output (human format): animated on a terminal, plain lines when piped
trustabl scan ./repo # spinner + bars on a TTY; "[phase] summary" lines when piped
trustabl scan ./repo --no-progress # disable progress entirely
# Diagnostics on stderr (global flags; stdout/report unaffected)
trustabl scan ./repo --verbose # -v: rule provenance, discovery counts, result summary
trustabl scan ./repo --debug # + per-phase timing and per-entity/per-finding detail
trustabl scan ./repo --debug --format json > out.json # clean JSON on stdout, diagnostics on stderr
# Run as a stdio MCP server so an MCP client (Claude Code, Cursor, Claude
# Desktop) can scan code an agent just wrote (see "Run as an MCP server" below)
trustabl mcp
# Configure LLM provider, then enrich a scan result with AI explanations and fixes
trustabl llm list # show configured providers with masked keys
trustabl llm key set # prompt securely for an API key
trustabl llm key set sk-ant-api03-... # set key non-interactively
trustabl llm key get # show masked key for active provider
trustabl llm key delete # delete key with confirmation prompt
trustabl llm model set claude-sonnet-4-6 # change model for active provider
trustabl llm provider set openai # switch active provider (auto-creates entry)
trustabl llm provider list # list configured providers
# Enrich a scan result (requires anthropic provider with a key set)
trustabl scan ./myrepo --format json | trustabl enrich --repo ./myrepo # pipe scan into enrich (stdout)
trustabl enrich --input scan.json --repo ./myrepo --output enriched.json # file in, file out
trustabl enrich --input scan.json --repo ./myrepo --diff # preview proposed fixes as a unified diff (stderr)
trustabl enrich --input scan.json --repo ./myrepo --diff --apply # preview and apply fixes
trustabl enrich --input scan.json --repo ./myrepo --apply # apply fixes without previewing
trustabl enrich --input scan.json --repo ./myrepo --rule CSDK-010 # focus on one rule
trustabl enrich --input scan.json --repo ./myrepo --only-enriched # CI: only enriched findings
Rules are cached under your OS cache dir (os.UserCacheDir(), e.g.
%LocalAppData%\trustabl\rules\ on Windows, ~/.cache/trustabl/rules/
on Linux). The first scan (or an explicit trustabl rules pull)
populates it; each subsequent scan checks for an update first (unless
--no-rules-update), falling back to the cached rules if the fetch
fails.
Signed rules channels (opt-in). --channel <name> resolves rules from a
signature-verified release channel instead of cloning git: Trustabl verifies a
signed channel statement against an embedded trust keyring, fetches the bundle
it commits to, and refuses (exit 2) on any verification failure rather than
running unverified rules. The default scan is unchanged — it still uses the
git source. A scan that did not use blessed production rules (a pre-release
--channel, or a custom --rules-repo source) is watermarked in the report and
in the JSON rules_origin field, and its provenance is folded into ScanID.
Signed channels require a build with published signing keys; until then
--channel refuses with a clear message.
Two CI patterns are supported, and they compose:
- Gate the build. The exit code is the gate: 0 clean, 1 on a finding of medium severity or higher (--strict lowers the bar to any finding), 2 on an operational error. A bare trustabl scan ./repo in a job step fails the job when it should.
- Publish to GitHub Code Scanning. On GitHub Actions, the trustabl/trustabl-action runs the scan and uploads the SARIF to the repository's Security tab in a single step (upload-sarif defaults to true), with inline PR alerts and optional threshold gating. This is the recommended path, and the single source of truth for the workflow. Without that action, --format sarif --output <file> writes a SARIF 2.1.0 report that github/codeql-action/upload-sarif or any SARIF-aware step can publish. Because --output writes the file before the findings-based exit code is applied, the scan step can run with continue-on-error: true and the upload with if: always(), so a scan that finds issues still surfaces them instead of aborting the run with nothing uploaded.
The SARIF document is a pure function of the scan result: byte-stable across
identical-input runs, repo-relative file URIs, and a stable
partialFingerprints per finding so Code Scanning deduplicates alerts across
runs rather than re-opening them.
trustabl mcp runs a Model Context Protocol (MCP) server over stdio, so an MCP
client (Claude Code, Cursor, Claude Desktop) can scan a directory an agent just
edited and read the findings back. It is the same scan as trustabl scan,
exposed as an MCP tool — it opens no network port. The server speaks JSON-RPC on
stdout, so it writes nothing else there; status lines and diagnostics go to
stderr.
It exposes two tools:
- scan: input { "path": "<dir>", "rules_ref": "<branch-or-tag>"?, "vuln_scan": true? }. Scans path and returns the full scan result (findings, scores, discovered inventory) as JSON, the same shape as --format json. Set vuln_scan: true to also match declared dependencies against a pinned OSV snapshot and report known CVEs (mirrors --vuln-scan; off by default).
- version: reports the build version, commit, and date.
Register it with an MCP client by pointing the client at the binary with the
mcp argument over stdio. For Claude Code:
claude mcp add trustabl -- trustabl mcp
Or configure it directly in a client's MCP config (the mcpServers stdio
shape used by Claude Desktop / Cursor):
{
"mcpServers": {
"trustabl": {
"command": "trustabl",
"args": ["mcp"]
}
}
}
The rules-source flags (--rules-repo, --rules-ref, --no-rules-update) work
on trustabl mcp exactly as on trustabl scan; a client may also pass a
per-call rules_ref in the scan tool arguments, which overrides the
command-level --rules-ref for that call.
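For readers wiring this up by hand, the sketch below builds the JSON-RPC message an MCP client would send to invoke the scan tool, including the optional per-call rules_ref. The `tools/call` method and its `name` / `arguments` parameters follow the MCP specification; the `scan_request` helper and the example path and tag are assumptions, and a real client must complete the MCP initialize handshake before sending it.

```python
import json
from typing import Optional


def scan_request(
    request_id: int,
    path: str,
    rules_ref: Optional[str] = None,
    vuln_scan: Optional[bool] = None,
) -> str:
    """Build a JSON-RPC 2.0 tools/call message for the scan tool."""
    arguments = {"path": path}
    # Optional arguments are omitted entirely when not set, so the server's
    # command-level defaults (such as --rules-ref) stay in effect.
    if rules_ref is not None:
        arguments["rules_ref"] = rules_ref
    if vuln_scan is not None:
        arguments["vuln_scan"] = vuln_scan
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "scan", "arguments": arguments},
    }
    return json.dumps(message)


# Example: pin this one call to a tagged ruleset (path and tag are illustrative).
request = scan_request(1, "./agent-repo", rules_ref="v0.1.0")
```

Over stdio, each such message is written to the server's stdin and the result is read back from its stdout.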
Trustabl ships a Claude Code plugin under .claude-plugin/
with two skills that form a scan-and-fix loop:
- trustabl-scan: triggers right after agent, tool, subagent, or MCP-server code is written or changed and calls Trustabl's scan tool (the bundled MCP server, mcp__trustabl__scan) to self-audit it before committing, upstream of CI.
- trustabl-enrich: takes the output of a trustabl scan run (JSON, SARIF, or pasted terminal text) and applies each finding as a targeted code edit, guided entirely by the scan's own explanation and suggested_fix fields. It does not re-run the scanner; use trustabl-scan first, then invoke trustabl-enrich with the results.
Scanning runs through a bundled MCP server: .mcp.json
registers a trustabl server whose command is a launcher
(scripts/trustabl-mcp.sh) exposing the
mcp__trustabl__scan tool. The launcher and a SessionStart hook
(hooks/hooks.json →
scripts/check-trustabl.sh) share install logic
(scripts/lib-trustabl.sh) that downloads the pinned
CLI version from GitHub Releases, verifies it against the release
checksums.txt, and installs it into the plugin's private data directory
($CLAUDE_PLUGIN_DATA — no sudo, nothing outside that dir touched). The
install is idempotent (re-runs only when the pin changes or the copy is
missing/corrupt), and the launcher installs synchronously before starting the
server, so there is no first-session race. The same binary is exposed as
$TRUSTABL_BIN for the enrich skill's direct-CLI path. When auto-install cannot
run (offline first session, an unsupported platform, or missing curl/tar) it
falls back to whatever trustabl is on PATH; the system-wide install stays a
consented step inside trustabl-scan. The plugin runs no network service of its
own and modifies nothing outside the scan target.
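
The verify step in that install flow can be sketched as follows. The plugin's real logic lives in shell (scripts/lib-trustabl.sh); this Python version is only an illustration and assumes that checksums.txt uses the common sha256sum layout of one "<hex digest>  <filename>" pair per line. The function names and that file layout are assumptions, not taken from the Trustabl source.

```python
import hashlib
import hmac


def parse_checksums(text):
    """Parse '<sha256-hex>  <filename>' lines into {filename: digest}.

    The sha256sum-style layout is an assumption about checksums.txt.
    """
    sums = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary-mode entries with a leading '*'.
            sums[name.lstrip("*")] = digest.lower()
    return sums


def verify_archive(data, filename, checksums_text):
    """Return True only if data hashes to the digest listed for filename."""
    expected = parse_checksums(checksums_text).get(filename)
    if expected is None:
        return False  # a file missing from the listing counts as unverified
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected)
```

An installer built this way would refuse to place the binary in $CLAUDE_PLUGIN_DATA whenever verify_archive returns False, which is also how a corrupt copy would trigger a re-install.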
Trustabl loads rules forward-compatibly. If the resolved pack targets a
newer rule-schema version than your binary understands, the scan still runs: it
evaluates every rule your build can understand and skips the rest, warning
on stderr (and recording the skipped rule IDs on ScanResult.RulesSkipped):

    warning: the rules target schema version 9 but this Trustabl build supports up to 8; 2 rule(s) newer than this build were skipped. Upgrade Trustabl to evaluate them.

The same per-rule skip happens for any individual rule that references a
scope, an applies_to value, or a predicate your build does not understand —
that rule is dropped while its siblings still run, so a newer rules release never
forces a lockstep binary upgrade. Every skip is also surfaced in the report
itself as a single META-005 info finding ("N rules require a newer Trustabl
engine"), so a degraded scan is never mistaken for a clean one. A malformed
rule your build does understand (a missing field, a bad value) is not
forward-skipped — it still hard-fails the load, so real authoring errors are
never silently dropped.
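
The skip-versus-fail distinction can be sketched in a few lines. This is not Trustabl's loader (which is written in Go); the field names id, scope, and schema_version, the per-rule version field, the scope set, and the shape of the returned finding are all assumptions made for illustration.

```python
SUPPORTED_SCHEMA_VERSION = 8  # hypothetical value for this sketch
KNOWN_SCOPES = {"tool", "agent", "subagent", "skill"}  # illustrative subset


class MalformedRuleError(ValueError):
    """A rule this build should understand is missing a required field."""


def load_rules(raw_rules):
    """Split raw rule dicts into (usable rules, skipped rule IDs)."""
    usable, skipped_ids = [], []
    for rule in raw_rules:
        if "id" not in rule:
            raise MalformedRuleError("rule is missing its id")
        # Too new for this build: skip it, do not fail the load.
        if rule.get("schema_version", 1) > SUPPORTED_SCHEMA_VERSION:
            skipped_ids.append(rule["id"])
            continue
        # Within the supported schema, a missing field is an authoring error.
        if "scope" not in rule:
            raise MalformedRuleError(f"{rule['id']}: missing scope")
        # An unknown scope value is treated like a newer feature: skip.
        if rule["scope"] not in KNOWN_SCOPES:
            skipped_ids.append(rule["id"])
            continue
        usable.append(rule)
    return usable, skipped_ids


def meta_finding(skipped_ids):
    """Summarize every skip as one info finding, or None if nothing was skipped."""
    if not skipped_ids:
        return None
    return {
        "id": "META-005",
        "severity": "info",
        "message": f"{len(skipped_ids)} rules require a newer Trustabl engine",
    }
```

The version check runs before field validation on purpose: a rule written for a newer schema may legitimately look malformed to an older build, so only rules within the supported schema are held to the strict check.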
To evaluate the skipped rules, upgrade Trustabl to a build whose
SupportedSchemaVersion (see internal/rules/schema_version.go) covers the
pack. No action is needed if you're comfortable running the subset.
The scan only fails (exit 2) when nothing usable remains:
- "all rules require a newer engine schema": every rule is too new for your build, so there is nothing to run. Upgrade Trustabl, or pin an older rules branch/tag your build fully understands (--rules-ref resolves branches and tags only, not raw commit SHAs, so a compatible ref must already exist):

      trustabl scan ./repo --rules-ref <branch-or-tag>

- "no usable rules manifest": the pack's manifest.yaml is missing, unparseable, or declares a non-positive version (a corrupt/truncated pack). Run trustabl rules pull to refresh.
- "no usable rules found": nothing cached and nothing fetchable (offline with a cold cache). Run trustabl rules pull while online.

| Pipeline node | Code path |
|---|---|
| Importer | internal/ingestion/importer.go |
| Normalizer (recon) | internal/ingestion/normalizer.go |
| Discovery (Python AST + markdown/JSON) | internal/analysis/discovery.go, agents.go, hosted_tools.go, mcp_servers.go, adk_agents.go (Python AST); subagents.go, markdown_agents.go, skills.go, slash_commands.go (markdown frontmatter); plugins.go, claude_settings.go (JSON) |
| TypeScript discovery | internal/analysis/ts_discovery.go, ts_agents.go, ts_mcp_servers.go, ts_handler_facts.go, ts_openai_tools.go, ts_openai_agents.go, ts_openai_hosted_tools.go, ts_openai_mcp_servers.go, ts_openai_guardrails.go, ts_openai_sessions.go, ts_adk_tools.go, ts_adk_agents.go, ts_adk_hosted_tools.go, astutil/ts.go |
| Detector runtime | internal/analysis/detectors/ |
| Rule source | internal/rulesource/ (git fetch + cache + schema-version gate) |
| Detector rules | external trustabl-rules repo (tests: testdata/rules-fixture/) |
| Rule engine | internal/rules/{schema,loader,evaluator,predicates,rule_detector}.go |
| Scoring engine | internal/analysis/scoring.go |
| Report renderer | internal/review/diff.go (human), internal/sarif/render.go (SARIF), JSON marshal in cmd/trustabl |
| LLM config | internal/llm/ (key storage · masking · validation) |
Rule packs live in the separate trustabl-rules git repository (grouped by SDK
under claude_sdk/, openai_sdk/, google_adk/, and mcp/), resolved at scan time rather
than embedded in the binary. Naming convention: CSDK-NNN for Claude
Agent SDK rules (CSDK-0xx tool-scope, CSDK-1xx agent + subagent-scope),
OAI-NNN for OpenAI Agents SDK rules, ADK-NNN for Google ADK rules,
MCP-NNN for the dedicated MCP tool-scope pack.
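
The naming convention is regular enough to decode mechanically. The sketch below maps a rule ID to its pack and, where the convention states one, its scope. The helper describe_rule_id is hypothetical, and the strict three-digit pattern is an assumption read off the NNN placeholder.

```python
import re

# Prefix-to-pack mapping taken from the naming convention above.
PACK_BY_PREFIX = {
    "CSDK": "Claude Agent SDK",
    "OAI": "OpenAI Agents SDK",
    "ADK": "Google ADK",
    "MCP": "MCP",
}

RULE_ID = re.compile(r"^([A-Z]+)-(\d{3})$")  # assumes exactly three digits


def describe_rule_id(rule_id):
    """Return {'pack', 'scope'} for a pack rule ID, or None if unrecognized.

    scope is None when the convention does not tie the number to a scope.
    """
    match = RULE_ID.match(rule_id)
    if match is None or match.group(1) not in PACK_BY_PREFIX:
        return None
    prefix, number = match.group(1), match.group(2)
    scope = None
    if prefix == "CSDK":
        # CSDK-0xx is tool-scope, CSDK-1xx is agent + subagent-scope.
        scope = {"0": "tool", "1": "agent + subagent"}.get(number[0])
    elif prefix == "MCP":
        scope = "tool"  # the dedicated MCP pack is tool-scope
    return {"pack": PACK_BY_PREFIX[prefix], "scope": scope}


if __name__ == "__main__":
    print(describe_rule_id("CSDK-110"))
```

IDs outside the four SDK packs, such as the META-005 engine finding, fall through to None here because they are not part of this convention.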
See
ARCHITECTURE.md § 2 — steps 3–4 for the
shipped rule table and COVERAGE.md for per-SDK
recognition detail.
testdata/corpus/ holds real-world agent code (Claude SDK demos, OpenAI Agents
SDK demos, Google ADK demos, a TS Claude SDK fixture) — a corpus, not a
controlled fixture, so well-written agents won't trigger most rules and
that's correct. See testdata/corpus/PROVENANCE.md
for upstream sources and licenses of each example. Per-rule fire/silent
correctness lives in internal/rules/policies_test.go; the end-to-end
sweep in internal/scanner/scanner_test.go only asserts the scanner
doesn't crash on real-world inputs. A labelled 20–40 real-agent-repo
corpus is the detection-quality target (see
ARCHITECTURE.md § 10);
the current tests are regression coverage, not detection-quality
measurement.
Join the Trustabl Discord to ask questions, share feedback, and follow development.
Apache-2.0. See LICENSE.
