Add dump --format=json for IR output #442

Draft
guillaume86 wants to merge 1 commit into pgplex:main from guillaume86:probe/ir-dump-json

@guillaume86 guillaume86 commented May 23, 2026

Summary

  • Adds a --format flag to pgschema dump. --format=sql (default) keeps the current behavior; --format=json writes the IR pretty-printed (2-space indent) to stdout.
  • The IR pgschema builds internally is already fully JSON-tagged on its public fields, so this is essentially publishing what was already there. The unexported sync.RWMutex fields are ignored by encoding/json automatically.
  • --multi-file is rejected with --format=json (the IR is a single document); unknown formats produce a clear error.
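
The flag handling and encoding described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `IR` and `Table` are hypothetical stand-ins for pgschema's real IR types, and `validateFormat` and `marshalIR` are made-up helper names. It shows the three rules from the summary (accept `sql` and `json`, reject `json` with multi-file output, reject anything else) and how encoding/json skips an unexported mutex field.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sync"
)

// Table and IR are hypothetical stand-ins for pgschema's IR types.
type Table struct {
	Name string `json:"name"`
}

type IR struct {
	mu     sync.RWMutex      // unexported, so encoding/json never sees it
	Tables map[string]*Table `json:"tables"`
}

// validateFormat applies the rules from the summary: only "sql" and "json"
// are accepted, and "json" cannot be combined with multi-file output.
// The error wording here is an assumption, not pgschema's actual message.
func validateFormat(format string, multiFile bool) error {
	switch format {
	case "sql":
		return nil
	case "json":
		if multiFile {
			return fmt.Errorf("--multi-file cannot be used with --format=json")
		}
		return nil
	default:
		return fmt.Errorf("unknown format %q (expected sql or json)", format)
	}
}

// marshalIR pretty-prints the IR with a 2-space indent.
func marshalIR(ir *IR) ([]byte, error) {
	return json.MarshalIndent(ir, "", "  ")
}

func main() {
	if err := validateFormat("json", false); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	out, err := marshalIR(&IR{Tables: map[string]*Table{"users": {Name: "users"}}})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(string(out))
}
```

The mutex field does not appear in the output because encoding/json only considers exported struct fields.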

Motivation

Downstream tooling — LSPs, diff visualizers, lint engines, CI gates, IR-aware test harnesses — currently has to re-parse pgschema's SQL output to recover the model. The internal IR is the natural source of truth, and exposing it directly:

  • Avoids round-tripping through SQL just to extract structure.
  • Locks in a stable surface for tools that want to build on top of pgschema's catalog reflection.
  • Costs almost nothing internally (one flag, one branch in ExecuteDump).

Test plan

  • TestIRJSONRoundTrip (unit, no DB) — marshal a constructed IR, unmarshal, marshal again; assert byte-for-byte equality. Proves round-trip stability.
  • TestExecuteDump_FormatValidation (unit) — unknown format and json + --multi-file produce clear errors.
  • TestDumpCommand_FormatJSON (integration, PG 17) — apply real DDL to embedded PG, run dump --format=json, parse the output, assert schema/table/column/index shape, and check the round-trip is stable.
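
For readers unfamiliar with the round-trip check, here is a minimal sketch of the idea behind TestIRJSONRoundTrip. It is not the test from this PR: `Schema`, `Column`, and `roundTripStable` are hypothetical names, and the real test operates on pgschema's own IR types. The point is only the shape of the assertion: marshal, unmarshal into a fresh value, marshal again, and compare bytes.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
	"sync"
)

// Column and Schema are hypothetical stand-ins for IR types.
type Column struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

type Schema struct {
	mu      sync.RWMutex       // unexported, ignored by encoding/json
	Columns map[string]*Column `json:"columns"`
}

// roundTripStable marshals in, unmarshals the result into a fresh Schema,
// marshals that again, and reports whether both encodings are identical.
func roundTripStable(in *Schema) (bool, error) {
	first, err := json.MarshalIndent(in, "", "  ")
	if err != nil {
		return false, err
	}
	var decoded Schema
	if err := json.Unmarshal(first, &decoded); err != nil {
		return false, err
	}
	second, err := json.MarshalIndent(&decoded, "", "  ")
	if err != nil {
		return false, err
	}
	return bytes.Equal(first, second), nil
}

func main() {
	s := &Schema{Columns: map[string]*Column{
		"id":    {Name: "id", Type: "bigint"},
		"email": {Name: "email", Type: "text"},
	}}
	ok, err := roundTripStable(s)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("round-trip stable:", ok)
}
```

Byte equality holds for map-valued fields because encoding/json writes map keys in sorted order, so Go's randomized map iteration does not affect the output.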

Run just the new tests:

go test -v ./cmd/dump -run 'TestIRJSONRoundTrip|TestExecuteDump_FormatValidation|TestDumpCommand_FormatJSON'

Full suite green locally (go test ./... ~7 min on PG 17).

Possible follow-ups (not in this PR, happy to discuss)

  • A model_version field on the JSON envelope so consumers can detect future schema changes to the IR (mirrors what Plan already does with PgschemaVersion).
  • A LoadIRFromJSON helper on ir for symmetric consumption — currently consumers can json.Unmarshal(bytes, &ir.IR{}) directly, which works but a typed loader is friendlier.
  • A JSON Schema for the dump --format=json output, published alongside.

Kept this PR minimal — each of the above can be its own conversation.
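
To make the typed-loader follow-up concrete, one possible shape is sketched below. Everything in it is an assumption for discussion: `loadIRFromJSON`, `IR`, and `Table` are hypothetical and do not exist in pgschema, and rejecting unknown fields is a design choice of this sketch, not something the PR proposes.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
)

// Table and IR are hypothetical stand-ins for pgschema's IR types.
type Table struct {
	Name string `json:"name"`
}

type IR struct {
	Tables map[string]*Table `json:"tables"`
}

// loadIRFromJSON decodes a JSON document into an IR value. Unknown fields
// are rejected so that a consumer notices when the producer's model has
// drifted from its own (an assumption of this sketch).
func loadIRFromJSON(data []byte) (*IR, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	var out IR
	if err := dec.Decode(&out); err != nil {
		return nil, fmt.Errorf("decode IR: %w", err)
	}
	return &out, nil
}

func main() {
	doc := []byte(`{"tables": {"users": {"name": "users"}}}`)
	ir, err := loadIRFromJSON(doc)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(ir.Tables["users"].Name)
}
```

A plain json.Unmarshal would silently drop unknown keys, which is the main difference a stricter loader like this would offer.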

🤖 Generated with Claude Code


Note (2026-05-23): pgproj (a downstream DACFX-shaped fork at https://github.com/guillaume86/pgproj) has adopted a hard-fork posture and will continue independently. This contribution remains open without expectation of merge; happy to address review feedback if a maintainer engages.

`pgschema dump` previously only emitted SQL. The IR it builds
internally is already fully JSON-tagged on the public fields, so
this exposes it directly via a new `--format` flag (default `sql`,
new `json`).

Motivation: downstream tooling (LSPs, diff visualizers, lint
engines, CI gates) wants to consume the model without re-parsing
the SQL dump. The internal IR was already the natural source of
truth; this is just publishing it.

The JSON output is pretty-printed (2-space indent) and stable
under round-trip marshal/unmarshal because encoding/json writes
map keys in sorted order and ignores the unexported sync.RWMutex
fields.

`--multi-file` is rejected with `--format=json` since the IR is a
single document; unknown formats produce a clear error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>