
feat: add ccs prune to losslessly shrink bloated conversations #14

Merged: brtkwr merged 1 commit into main from feat/prune-cli on Jun 22, 2026

@brtkwr brtkwr commented Jun 22, 2026

Collaborator

What

A ccs prune subcommand that shrinks large conversation files by removing only data that duplicates content kept elsewhere, so pruned conversations still resume with their full dialogue.

Investigating real files showed the referenced pruning blog post targets a message shape that no longer exists (no agent_progress/bash_progress lines), and that the genuinely-duplicated data is different:

  • toolUseResult - effectively a copy of the tool_result already inside message.content (verified: 189,154 vs 189,156 bytes on the biggest tool line, a 2-byte difference). The model resumes from message.content, so dropping this loses nothing it reads.
  • file-history-snapshot lines - rewind/checkpoint backups, redundant across snapshots and with the files on disk. Pruning loses rewind history, not the conversation.

Across the 8 largest files these are ~25% of bytes; everything else (assistant/user messages, attachments) is real content and left untouched.
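
The per-line rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it assumes each JSONL line is an object with a top-level "type" string and an optional top-level "toolUseResult" key, and the names pruneOpts and pruneLine are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// pruneOpts mirrors the two categories described above (hypothetical names).
type pruneOpts struct {
	DropSnapshots    bool
	StripToolResults bool
}

// pruneLine returns the bytes to write for one JSONL line and whether the
// line should be kept at all.
// Assumed schema: top-level "type" string, optional top-level "toolUseResult".
func pruneLine(line []byte, opts pruneOpts) ([]byte, bool, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(line, &obj); err != nil {
		// Not a JSON object: pass it through untouched rather than guess.
		return line, true, nil
	}

	var typ string
	_ = json.Unmarshal(obj["type"], &typ) // missing or non-string type leaves typ empty

	if opts.DropSnapshots && typ == "file-history-snapshot" {
		return nil, false, nil
	}

	if _, ok := obj["toolUseResult"]; !ok || !opts.StripToolResults {
		// Nothing to strip: emit the original bytes verbatim.
		return line, true, nil
	}
	delete(obj, "toolUseResult")

	// Re-encode the remaining keys; their values are kept as raw JSON.
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false) // do not rewrite <, >, & as \u escapes
	if err := enc.Encode(obj); err != nil {
		return nil, false, err
	}
	return bytes.TrimRight(buf.Bytes(), "\n"), true, nil
}
```

Lines without a toolUseResult key are returned byte-for-byte. Re-encoding a map sorts keys alphabetically, so a stripped line is the same JSON value but not the same key order; an implementation that must preserve order would need a different approach.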

Safety

  • Dry run by default - ccs prune only previews; --apply is required to rewrite anything.
  • User and assistant message lines are never modified or dropped.
  • Each file is streamed to <path>.pruned and atomically renamed only if the conversation line count is unchanged - a failed or partial prune never clobbers the original.
  • By default, files under 50MB are skipped; --apply asks for confirmation before writing (-y skips the prompt).

CLI

ccs prune                       # dry-run preview (files >= 50MB), no changes
ccs prune --apply               # prune after confirmation
ccs prune --apply --min-size=200  # only files >= 200MB
ccs prune --apply --no-tool-results  # keep tool results, drop only snapshots
ccs prune --apply -y            # skip the prompt
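
For illustration, the flags above could be parsed with Go's standard flag package along these lines. This is a sketch only: whether ccs uses the flag package is an assumption, and pruneFlags, parsePruneFlags and needsConfirmation are hypothetical names. The 50MB default comes from the Safety section.

```go
package main

import "flag"

// pruneFlags holds the options shown in the CLI examples (hypothetical type).
type pruneFlags struct {
	Apply         bool
	MinSizeMB     int
	NoToolResults bool
	NoSnapshots   bool
	Yes           bool
}

// parsePruneFlags parses the arguments that follow "ccs prune".
// The flag package accepts both -apply and --apply, and --min-size=200.
func parsePruneFlags(args []string) (pruneFlags, error) {
	var f pruneFlags
	fs := flag.NewFlagSet("prune", flag.ContinueOnError)
	fs.BoolVar(&f.Apply, "apply", false, "rewrite files instead of previewing")
	fs.IntVar(&f.MinSizeMB, "min-size", 50, "skip files smaller than this many MB")
	fs.BoolVar(&f.NoToolResults, "no-tool-results", false, "keep toolUseResult fields")
	fs.BoolVar(&f.NoSnapshots, "no-snapshots", false, "keep file-history-snapshot lines")
	fs.BoolVar(&f.Yes, "y", false, "skip the confirmation prompt")
	err := fs.Parse(args)
	return f, err
}

// needsConfirmation reports whether the run should prompt before writing:
// only when --apply is set and -y is not.
func (f pruneFlags) needsConfirmation() bool {
	return f.Apply && !f.Yes
}
```

With no arguments every boolean is false, so the run is a preview that writes nothing and needs no prompt.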

Read-only default run on real data:

  ...infra/4540c603-....jsonl     75MB ->     53MB  (-21MB)
  ...auth-gateway/b8d9d564-....jsonl  61MB ->  44MB  (-16MB)
  ...
Would reclaim 69MB across 5 files. Re-run with --apply to prune.

Tests

  • TestPruneStreamRemovesDuplicatesKeepsDialogue - snapshot dropped, toolUseResult stripped, message.content/dialogue preserved, conv-line count invariant, output stays valid JSON.
  • TestPruneOptsRespected - --no-snapshots / --no-tool-results honoured.
  • TestPruneFileReplacesAndShrinks - atomic replace, file shrinks, .pruned temp cleaned up, integrity preserved.
  • TestPruneFileDryRunLeavesFileUnchanged - dry run reports savings but doesn't touch the file.
  • go test -cover reports 63.9% coverage (the uncovered part is the runPrune CLI driver; the data-safety core is covered).

Note

CLI half only. The TUI per-conversation prune action (a key + confirmation, like Ctrl+D delete) is a planned follow-up. Resume-safety of dropping toolUseResult is provable for the model's view but not for Claude Code's own UI/rewind, so try --apply on a low-risk conversation first.

Conversation JSONL files grow large over time. `ccs prune` rewrites them,
removing only data that duplicates content kept elsewhere, so pruned
conversations still resume with full dialogue:

- toolUseResult fields (a copy of the tool_result already in
  message.content)
- file-history-snapshot lines (rewind/checkpoint backups; pruning loses
  rewind history, not the conversation)

User and assistant messages are never modified. Each file is rewritten
to <path>.pruned and atomically renamed only if its conversation line
count is unchanged, so a failed/partial prune never clobbers the
original.

Dry run by default - `ccs prune` only previews savings; pass --apply to
actually rewrite. A dry run on real files shows ~25% reclaimable (e.g. a
75MB conversation -> 53MB) with no dialogue loss.

CLI: `ccs prune [--apply] [--min-size=N] [--no-tool-results]
[--no-snapshots] [-y]`. Tests cover the stream transform, per-category
options, atomic file replace + integrity check, and dry-run leaving
files untouched.
@brtkwr brtkwr merged commit bc5e47b into main Jun 22, 2026
1 check passed