
feat(engine): project mode prompts per request#2687

Closed
LeoAlex0 wants to merge 1 commit into Hmbown:main from LeoAlex0:feat/cache-append-only

Conversation

@LeoAlex0
Contributor

@LeoAlex0 LeoAlex0 commented Jun 3, 2026

Summary:
Keep the base system prompt mode-agnostic and byte-stable. Mode instructions, tool taxonomy, and approval policy are now projected as transient request-time runtime metadata at the tail of the message list instead of mutating message[0] or persisting extra system messages in history.

Scope:

  • Remove mode, approval, and generated tool taxonomy from the stable prompt composition while keeping personality in message[0].
  • Project current mode and approval policy per request so mode changes do not rewrite stored session history.
  • Keep strict chat-template providers compatible by avoiding appended non-leading system messages.
  • Preserve existing MCP deferral behavior; only native tool deferral was simplified where mode was unused.
  • Respect system_prompt_override only when callers explicitly request a persisted host-supplied prefix.
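The projection described in this scope can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Role`, `Message`, `Mode`, `runtime_prompt`, `project_request`, and the approval strings are hypothetical stand-ins for the engine's real types.

```rust
// Hypothetical types for illustration only; the real ones live in crates/tui.
#[derive(Clone, Debug, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

#[derive(Clone, Debug, PartialEq)]
pub struct Message {
    pub role: Role,
    pub content: String,
}

#[derive(Clone, Copy, Debug)]
pub enum Mode {
    Plan,
    Agent,
}

/// Render the current mode and approval policy as transient runtime metadata.
/// The tag name follows the <runtime_prompt> wording used in this PR; the
/// body layout is an assumption.
pub fn runtime_prompt(mode: Mode, approval: &str) -> Message {
    Message {
        role: Role::User,
        content: format!(
            "<runtime_prompt>\nmode: {:?}\napproval: {}\n</runtime_prompt>",
            mode, approval
        ),
    }
}

/// Build the message list for one request: a copy of the stored history with
/// the runtime prompt appended at the tail. The history is only borrowed, so
/// a mode change cannot rewrite it.
pub fn project_request(history: &[Message], mode: Mode, approval: &str) -> Vec<Message> {
    let mut out = history.to_vec();
    out.push(runtime_prompt(mode, approval));
    out
}
```

Calling `project_request` twice with different modes yields lists that differ only in their final element, which is the property the scope items rely on.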

Not in this slice:

  • Changing dynamic MCP tool deferral semantics.
  • Preserving prefix cache across model changes.

Validation:

  • nix develop -c cargo fmt --check
  • nix develop -c cargo test -p codewhale-tui runtime_prompt_is_projected_without_persisting_to_session_messages -- --nocapture
  • nix develop -c cargo test -p codewhale-tui cache_inspect_displays_tool_result_budget_metadata -- --nocapture
  • nix develop -c cargo test -p codewhale-tui mcp::tests::legacy_sse_closed_stream_reconnects_and_retries_tool_call -- --nocapture
  • nix build

Note:
A full local nix develop -c cargo test -p codewhale-tui -- --nocapture run saw one transient failure in mcp::tests::legacy_sse_closed_stream_reconnects_and_retries_tool_call; the same test passed when rerun directly.

Greptile Summary

This PR refactors the engine's prompt architecture so message[0] is completely mode-agnostic: mode instructions, approval policy, and tool taxonomy are stripped from the stable system prompt and instead injected as a transient user-role <runtime_prompt> message appended at request time by messages_with_turn_metadata(). Personality (Calm/Playful) remains in message[0].

  • prompts.rs: Removes mode/approval/taxonomy from the compose_prompt parts array, keeping only base prompt + personality; functions made pub(crate) for engine use.
  • engine.rs: refresh_system_prompt() drops the mode parameter; new runtime_prompt_message() / runtime_prompt_text() project mode+approval at request time; approval_mode_for() / agent_approval_mode_for_turn() helpers added.
  • capacity_flow.rs / turn_loop.rs / tool_catalog.rs: Dead mode parameters removed from internal functions.
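As a rough sketch of the stable-prompt side of this change: a composer that takes only a base prompt and a personality cannot vary with mode or approval policy, because it never receives them. The function name `compose_stable_prompt` and the personality strings below are assumptions for illustration; only the Calm/Playful split comes from the summary above.

```rust
#[derive(Clone, Copy, Debug)]
pub enum Personality {
    Calm,
    Playful,
}

// Placeholder personality text; the real wording is not shown in this PR page.
fn personality_text(p: Personality) -> &'static str {
    match p {
        Personality::Calm => "Tone: calm and concise.",
        Personality::Playful => "Tone: playful but precise.",
    }
}

/// Compose the stable prompt from base + personality only.
/// There is deliberately no mode or approval parameter.
pub fn compose_stable_prompt(base: &str, personality: Personality) -> String {
    let parts = [base.trim(), personality_text(personality)];
    parts
        .iter()
        .copied()
        .filter(|p| !p.is_empty())
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

Because the output depends only on its two inputs, repeated calls across mode switches produce identical bytes for message[0].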

Confidence Score: 5/5

Safe to merge. Mode and approval are always re-injected fresh at each API request boundary via the transient runtime prompt, so no compaction path can drop them from the model's view.

The runtime-prompt projection approach is sound and well-tested. The new tests directly verify the key invariants. The sole finding is a loosened test bound in the error escalation test that does not indicate a production defect.

crates/tui/src/core/engine/tests.rs — one loosened message-count bound in the error escalation test.

Important Files Changed

Filename Overview
crates/tui/src/core/engine.rs Core architectural change: refresh_system_prompt() drops mode param; new runtime_prompt_message() and helpers project mode+approval at request time via a transient user-role message; messages_with_turn_metadata() now appends the runtime prompt on every call.
crates/tui/src/prompts.rs Mode/approval/taxonomy removed from compose_prompt parts array; stable prompt now contains only base + personality; functions made pub(crate); tests updated to assert mode/approval are absent from the base prompt.
crates/tui/src/core/engine/tests.rs Tests updated for the new runtime-prompt architecture; new invariant tests added; error_escalation message-count bound loosened from 2 to 4 without clear justification.
crates/tui/src/core/engine/capacity_flow.rs Dead mode: AppMode parameters removed from replan and replay functions; refresh_system_prompt() calls updated.
crates/tui/src/core/engine/turn_loop.rs refresh_system_prompt() and checkpoint calls drop the mode argument; messages_with_turn_metadata() updated to append the runtime prompt.
crates/tui/src/core/engine/tool_catalog.rs Dead _mode: AppMode parameter removed from should_default_defer_tool and apply_native_tool_deferral; mode still used in ensure_advanced_tooling for Plan-mode gating.
crates/tui/src/core/ops.rs Doc comment on Op::SetModel updated; no functional change.
crates/tui/src/core/session.rs Doc comment on system_prompt_override updated; no functional change.
crates/tui/src/commands/debug.rs Cache-inspect test assertions updated to count-based checks since the runtime prompt is now an additional message in the inspect output.


Reviews (12): Last reviewed commit: "feat(cache): project mode prompts per re..." | Re-trigger Greptile

@github-actions

github-actions Bot commented Jun 3, 2026

Thanks @LeoAlex0 for taking the time to contribute.

This repository is currently observing a maintainer-managed contribution gate in dry-run mode, so this pull request is staying open. When enforcement is enabled, pull requests from contributors who are not listed in .github/APPROVED_CONTRIBUTORS will be closed automatically.

Please read CONTRIBUTING.md for the expected contribution shape. A maintainer can grant PR access by commenting /lgtm on a pull request.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors system prompt handling in the TUI engine to improve prefix-cache stability (e.g., for DeepSeek) by delivering mode contracts as append-only system messages rather than inlining them into the initial system prompt. The review feedback highlights a critical bug where the system_prompt_override flag from Op::SyncSession is unconditionally ignored, which prevents synced sessions from refreshing dynamic context when intended. Additionally, the feedback suggests updating a test to properly verify prompt preservation and notes a design issue where using AppMode::Agent as the stable baseline results in duplicate or conflicting instructions for other modes.
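To make the prefix-cache concern concrete, here is a small sketch that measures how many leading bytes two serialized requests share. The serialization format and both helper names are assumptions; real providers tokenize and cache differently, so treat this as an intuition aid only.

```rust
/// Length in bytes of the longest shared prefix of two serialized requests.
pub fn common_prefix_len(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// Toy serialization of (role, content) pairs; a stand-in for a chat template.
pub fn serialize(messages: &[(&str, &str)]) -> String {
    messages
        .iter()
        .map(|(role, content)| format!("[{}] {}\n", role, content))
        .collect()
}
```

With mode text inlined into the first message, two requests that differ only in mode diverge inside message[0]. With mode text placed at the tail, the shared prefix covers the whole stored history.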


Comment thread crates/tui/src/core/engine.rs Outdated
Comment thread crates/tui/src/core/engine/tests.rs Outdated
Comment thread crates/tui/src/core/engine.rs Outdated
Comment thread crates/tui/src/core/engine.rs Outdated
@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch from e26e8fd to dd1d16b Compare June 3, 2026 19:06
Comment thread crates/tui/src/core/engine/turn_loop.rs
@LeoAlex0 LeoAlex0 marked this pull request as ready for review June 3, 2026 19:16
@LeoAlex0 LeoAlex0 marked this pull request as draft June 3, 2026 19:20
@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch from dd1d16b to 941d931 Compare June 3, 2026 19:21
@LeoAlex0 LeoAlex0 marked this pull request as ready for review June 3, 2026 19:30
@LeoAlex0 LeoAlex0 marked this pull request as draft June 3, 2026 19:48
@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch 6 times, most recently from 832ec40 to 207f670 Compare June 3, 2026 20:33
@LeoAlex0 LeoAlex0 marked this pull request as ready for review June 3, 2026 20:36
@LeoAlex0 LeoAlex0 changed the title feat(cache): append mode prompts without rewriting prefix feat(engine): mode-agnostic system prompt with append-only mode/approval messages Jun 3, 2026
@greptile-apps
Contributor

greptile-apps Bot commented Jun 3, 2026


Comment thread crates/tui/src/core/engine.rs Outdated
@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch from 207f670 to 0d280fb Compare June 3, 2026 20:47
@LeoAlex0 LeoAlex0 marked this pull request as draft June 3, 2026 20:52
@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch from 0d280fb to 37d9525 Compare June 3, 2026 20:52
@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch 6 times, most recently from 8b61127 to 3a3c289 Compare June 4, 2026 03:00
@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch from 3a3c289 to 32dd162 Compare June 4, 2026 03:02
@LeoAlex0 LeoAlex0 marked this pull request as ready for review June 4, 2026 03:03
@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch 2 times, most recently from ca5607e to 6fd087e Compare June 4, 2026 03:18
Comment thread crates/tui/src/prompts.rs Outdated
@Hmbown
Owner

Hmbown commented Jun 4, 2026

very smart - thank you. will work on getting this harvested at the very least :) @LeoAlex0

@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch 5 times, most recently from 9bb15e2 to 38e6d52 Compare June 4, 2026 19:21
Comment thread crates/tui/src/tui/ui.rs
Comment on lines 5995 to 6001
  session_id: app.current_session_id.clone(),
  messages: app.api_messages.clone(),
  system_prompt: app.system_prompt.clone(),
- system_prompt_override: false,
+ system_prompt_override: true,
  model: app.model.clone(),
  workspace: workspace.clone(),
  })
Contributor


P1 Workspace switch freezes old workspace's skills and project instructions

switch_workspace shuts down the old engine, spawns a fresh engine (which correctly builds its initial prompt from the new workspace config), and then immediately overrides that fresh prompt by sending SyncSession with app.system_prompt (the old workspace's prompt) and system_prompt_override: true. Because apply_workspace_runtime_state never clears app.system_prompt, the frozen value is the old workspace's text. The new engine then sets session.system_prompt_override = true, so every subsequent refresh_system_prompt() call early-returns without regenerating from the new workspace's skills directory or project instructions. Skills and .codewhale/instructions from the new workspace will never appear in message[0] for the rest of that engine's lifetime.
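The failure mode described here can be reproduced in miniature. The sketch below is a simplified model, assuming a session that stores a prompt plus an override flag; `Session`, `sync_session`, and the string contents are hypothetical and only mirror the names used in this comment.

```rust
#[derive(Debug, Default)]
pub struct Session {
    pub system_prompt: String,
    pub system_prompt_override: bool,
}

impl Session {
    /// Regenerate the prompt from the current workspace, unless a
    /// host-supplied override has been pinned.
    pub fn refresh_system_prompt(&mut self, workspace_instructions: &str) {
        if self.system_prompt_override {
            // Early return: the pinned prompt is never regenerated.
            return;
        }
        self.system_prompt = format!("base\n\n{}", workspace_instructions);
    }

    /// Hypothetical stand-in for handling a sync-session operation.
    pub fn sync_session(&mut self, system_prompt: String, system_prompt_override: bool) {
        self.system_prompt = system_prompt;
        self.system_prompt_override = system_prompt_override;
    }
}
```

Syncing a fresh session with the old workspace's prompt and `system_prompt_override` set to `true` pins that text, so later refreshes against the new workspace are no-ops. Passing `false` lets the next refresh rebuild the prompt from the new workspace.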


@Hmbown
Owner

Hmbown commented Jun 5, 2026

Very useful direction, thank you @LeoAlex0. I don’t want to direct-merge this yet because appended system messages collide with strict chat-template providers, and runtime prompt override semantics need a narrower design. Keeping this open for a focused prompt-stability harvest.
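For reference, the chat-template constraint can be expressed as a simple check. This sketch assumes a strict provider accepts a system message only at index 0; the `Role` enum and `first_non_leading_system` are illustrative names, not part of the codebase.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

/// Return the index of the first system message that is not at position 0,
/// or None if the role sequence satisfies the assumed strict-template rule.
pub fn first_non_leading_system(roles: &[Role]) -> Option<usize> {
    roles
        .iter()
        .enumerate()
        .skip(1)
        .find(|(_, r)| **r == Role::System)
        .map(|(i, _)| i)
}
```

An appended system message at the tail trips this check, while a user-role runtime message at the tail does not.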

Keep the stable system prompt mode-agnostic and project the current mode and approval policy as request-time runtime metadata. This avoids mutating stored history while preserving provider chat-template compatibility.

Also relax the cache inspect metadata test so nix builds do not depend on process-local spillover dedup state.
@LeoAlex0 LeoAlex0 force-pushed the feat/cache-append-only branch from 38e6d52 to 7794330 Compare June 5, 2026 03:00
@LeoAlex0 LeoAlex0 changed the title feat(engine): mode-agnostic system prompt with append-only mode/approval messages feat(engine): project mode prompts per request Jun 5, 2026
@Hmbown
Owner

Hmbown commented Jun 5, 2026

Thanks @LeoAlex0. We harvested the safe v0.9 runtime-prompt metadata slice in #2801 with your GitHub-mappable authorship preserved and merged it into the stewardship branch at fbe8d9ee5d0e83f1e95196c66659a43b91d2ac75.

The landed version keeps stable system prompts mode-agnostic, projects mode / approval / tool-taxonomy metadata per request as user-role runtime metadata, preserves the stewardship turn-metadata cache tests, keeps the replan replay guard at <= 2, and tightens cache-inspect assertions for both deduplicated=false and deduplicated=true.

#2762 is green at fbe8d9ee5d0e83f1e95196c66659a43b91d2ac75, which includes this harvest. Closing this source PR as harvested/superseded. #2722 remains open as the broader v0.9 PR-harvest tracker.


2 participants