feat(engine): project mode prompts per request#2687
Thanks @LeoAlex0 for taking the time to contribute. This repository is currently observing a maintainer-managed contribution gate in dry-run mode, so this pull request is staying open. When enforcement is enabled, pull requests from contributors who are not listed in Please read
Code Review
This pull request refactors system prompt handling in the TUI engine to improve prefix-cache stability (e.g., for DeepSeek) by delivering mode contracts as append-only system messages rather than inlining them into the initial system prompt. The review feedback highlights a critical bug where the system_prompt_override flag from Op::SyncSession is unconditionally ignored, which prevents synced sessions from refreshing dynamic context when intended. Additionally, the feedback suggests updating a test to properly verify prompt preservation and notes a design issue where using AppMode::Agent as the stable baseline results in duplicate or conflicting instructions for other modes.
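To make the prefix-cache argument concrete, here is a minimal sketch comparing how many leading messages two consecutive requests share under each design. Everything in it (`Msg`, `inlined_request`, `append_only_history`, the prompt strings) is a hypothetical stand-in rather than the PR's code, and it assumes a provider cache that can only reuse an unchanged leading run of messages.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Msg {
    role: &'static str,
    content: String,
}

fn msg(role: &'static str, content: &str) -> Msg {
    Msg { role, content: content.to_string() }
}

/// Number of leading messages that are identical in both requests.
fn shared_prefix_len(a: &[Msg], b: &[Msg]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// Inlined design (hypothetical): the mode contract lives inside message[0].
fn inlined_request(mode: &str, turns: &[&str]) -> Vec<Msg> {
    let mut out = vec![msg("system", &format!("BASE PROMPT\nmode contract: {mode}"))];
    out.extend(turns.iter().map(|t| msg("user", t)));
    out
}

/// Append-only design (hypothetical): message[0] never changes and each
/// mode change is recorded as an extra system message at the end.
fn append_only_history() -> (Vec<Msg>, Vec<Msg>) {
    let first = vec![
        msg("system", "BASE PROMPT"),
        msg("system", "mode contract: plan"),
        msg("user", "outline the change"),
    ];
    let mut second = first.clone();
    second.push(msg("system", "mode contract: agent"));
    second.push(msg("user", "apply it"));
    (first, second)
}

fn main() {
    let a = inlined_request("plan", &["outline the change"]);
    let b = inlined_request("agent", &["outline the change", "apply it"]);
    // Switching modes rewrites message[0], so nothing is shared.
    println!("inlined shared prefix: {}", shared_prefix_len(&a, &b)); // prints 0

    let (first, second) = append_only_history();
    // The whole first request is a prefix of the second.
    println!("append-only shared prefix: {}", shared_prefix_len(&first, &second)); // prints 3
}
```

The sketch only models message-level equality; a real cache works on tokens, but the same reasoning applies: any edit to `message[0]` invalidates everything after it.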
very smart - thank you. will work on getting this harvested at the very least :) @LeoAlex0
      session_id: app.current_session_id.clone(),
      messages: app.api_messages.clone(),
      system_prompt: app.system_prompt.clone(),
-     system_prompt_override: false,
+     system_prompt_override: true,
      model: app.model.clone(),
      workspace: workspace.clone(),
  })
Workspace switch freezes old workspace's skills and project instructions
switch_workspace shuts down the old engine, spawns a fresh engine (which correctly builds its initial prompt from the new workspace config), and then immediately overrides that fresh prompt by sending SyncSession with app.system_prompt (the old workspace's prompt) and system_prompt_override: true. Because apply_workspace_runtime_state never clears app.system_prompt, the frozen value is the old workspace's text. The new engine then sets session.system_prompt_override = true, so every subsequent refresh_system_prompt() call early-returns without regenerating from the new workspace's skills directory or project instructions. Skills and .codewhale/instructions from the new workspace will never appear in message[0] for the rest of that engine's lifetime.
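The failure mode described above can be reproduced in a small model. This is a sketch under stated assumptions, not the engine's code: `Session`, `build_prompt`, `sync`, and `prompt_after_switch` are hypothetical, and only the early return in `refresh_system_prompt` and the `system_prompt_override` flag are taken from the review comment.

```rust
struct Session {
    system_prompt: String,
    system_prompt_override: bool,
}

/// Hypothetical prompt builder: base text plus the workspace's instructions.
fn build_prompt(workspace_instructions: &str) -> String {
    format!("BASE\n{workspace_instructions}")
}

impl Session {
    /// A fresh engine builds its prompt from the workspace it was spawned for.
    fn new(workspace_instructions: &str) -> Self {
        Session {
            system_prompt: build_prompt(workspace_instructions),
            system_prompt_override: false,
        }
    }

    /// Simplified stand-in for handling a SyncSession operation.
    fn sync(&mut self, system_prompt: String, system_prompt_override: bool) {
        self.system_prompt = system_prompt;
        self.system_prompt_override = system_prompt_override;
    }

    /// Early-returns while an override is active, as the review describes.
    fn refresh_system_prompt(&mut self, workspace_instructions: &str) {
        if self.system_prompt_override {
            return;
        }
        self.system_prompt = build_prompt(workspace_instructions);
    }
}

/// Simulates switching from workspace A to workspace B and returns the
/// prompt the new engine holds after one refresh.
fn prompt_after_switch(override_flag: bool) -> String {
    let stale = build_prompt("instructions for workspace A");
    let mut engine = Session::new("instructions for workspace B");
    engine.sync(stale, override_flag);
    engine.refresh_system_prompt("instructions for workspace B");
    engine.system_prompt
}

fn main() {
    // With the override set, the stale workspace A prompt is frozen in place.
    println!("{}", prompt_after_switch(true));
    // Without it, the next refresh regenerates from workspace B.
    println!("{}", prompt_after_switch(false));
}
```

In this model, sending `false` (or clearing the stale prompt before the sync) lets the refresh pick up the new workspace. Whether the PR resolved the finding that way is not shown in this thread.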
Very useful direction, thank you @LeoAlex0. I don’t want to direct-merge this yet because appended system messages collide with strict chat-template providers, and runtime prompt override semantics need a narrower design. Keeping this open for a focused prompt-stability harvest.
Keep the stable system prompt mode-agnostic and project the current mode and approval policy as request-time runtime metadata. This avoids mutating stored history while preserving provider chat-template compatibility. Also relax the cache inspect metadata test so nix builds do not depend on process-local spillover dedup state.
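A minimal sketch of what "project at request time without mutating stored history" can look like. The type and function names (`Message`, `Mode`, `runtime_prompt`, `project_for_request`) and the approval strings are assumptions made for illustration; the user role and the `<runtime_prompt>` wrapper follow the description elsewhere in this thread.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Message {
    role: &'static str,
    content: String,
}

#[derive(Clone, Copy)]
enum Mode {
    Plan,
    Agent,
}

fn mode_name(mode: Mode) -> &'static str {
    match mode {
        Mode::Plan => "plan",
        Mode::Agent => "agent",
    }
}

/// Builds the transient runtime-metadata message (hypothetical format).
fn runtime_prompt(mode: Mode, approval: &str) -> Message {
    Message {
        role: "user",
        content: format!(
            "<runtime_prompt>\nmode: {}\napproval: {}\n</runtime_prompt>",
            mode_name(mode),
            approval
        ),
    }
}

/// Returns the messages to send for one request. The stored history is only
/// borrowed, so the runtime prompt can never leak into it.
fn project_for_request(stored: &[Message], mode: Mode, approval: &str) -> Vec<Message> {
    let mut out = stored.to_vec();
    out.push(runtime_prompt(mode, approval));
    out
}

fn main() {
    let stored = vec![
        Message { role: "system", content: "BASE PROMPT".to_string() },
        Message { role: "user", content: "refactor the parser".to_string() },
    ];

    let plan_request = project_for_request(&stored, Mode::Plan, "on-request");
    let agent_request = project_for_request(&stored, Mode::Agent, "never");

    // The stored history still has two messages; each request has three.
    println!("stored: {}, plan: {}, agent: {}", stored.len(), plan_request.len(), agent_request.len());
    // Both requests share the full stored history as a prefix.
    assert_eq!(plan_request[..stored.len()], agent_request[..stored.len()]);
}
```

Because the metadata sits at the tail, a mode change alters only the last message of the request, and no extra system-role message appears after the conversation has started.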
Thanks @LeoAlex0. We harvested the safe v0.9 runtime-prompt metadata slice in #2801 with your GitHub-mappable authorship preserved and merged it into the stewardship branch at The landed version keeps stable system prompts mode-agnostic, projects mode / approval / tool-taxonomy metadata per request as user-role runtime metadata, preserves the stewardship turn-metadata cache tests, keeps the replan replay guard at #2762 is green at
Summary:
Keep the base system prompt mode-agnostic and byte-stable. Mode instructions, tool taxonomy, and approval policy are now projected as transient request-time runtime metadata at the tail of the message list instead of mutating message[0] or persisting extra system messages in history.
Scope:
message[0].
system_prompt_override only when callers explicitly request a persisted host-supplied prefix.
Not in this slice:
Validation:
nix develop -c cargo fmt --check
nix develop -c cargo test -p codewhale-tui runtime_prompt_is_projected_without_persisting_to_session_messages -- --nocapture
nix develop -c cargo test -p codewhale-tui cache_inspect_displays_tool_result_budget_metadata -- --nocapture
nix develop -c cargo test -p codewhale-tui mcp::tests::legacy_sse_closed_stream_reconnects_and_retries_tool_call -- --nocapture
nix build
Note:
A full local nix develop -c cargo test -p codewhale-tui -- --nocapture run saw one transient failure in mcp::tests::legacy_sse_closed_stream_reconnects_and_retries_tool_call; the same test passed when rerun directly.
Greptile Summary
This PR refactors the engine's prompt architecture so message[0] is completely mode-agnostic: mode instructions, approval policy, and tool taxonomy are stripped from the stable system prompt and instead injected as a transient user-role <runtime_prompt> message appended at request time by messages_with_turn_metadata(). Personality (Calm/Playful) remains in message[0].
prompts.rs: Removes mode/approval/taxonomy from the compose_prompt parts array, keeping only base prompt + personality; functions made pub(crate) for engine use.
engine.rs: refresh_system_prompt() drops the mode parameter; new runtime_prompt_message() / runtime_prompt_text() project mode+approval at request time; approval_mode_for() / agent_approval_mode_for_turn() helpers added.
capacity_flow.rs / turn_loop.rs / tool_catalog.rs: Dead mode parameters removed from internal functions.
Confidence Score: 5/5
Safe to merge. Mode and approval are always re-injected fresh at each API request boundary via the transient runtime prompt, so no compaction path can drop them from the model's view.
The runtime-prompt projection approach is sound and well-tested. The new tests directly verify the key invariants. The sole finding is a loosened test bound in the error escalation test that does not indicate a production defect.
crates/tui/src/core/engine/tests.rs — one loosened message-count bound in the error escalation test.
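The claim that no compaction path can drop mode and approval follows from the metadata never being stored. A small sketch of that reasoning, where `compact` and `request_messages` are hypothetical helpers and the compaction policy (keep `message[0]` plus the newest entries) is an assumption chosen for illustration:

```rust
/// Hypothetical compaction: keep message[0] plus the newest `keep_last` entries.
fn compact(history: &mut Vec<String>, keep_last: usize) {
    if history.len() > keep_last + 1 {
        let cut = history.len() - keep_last;
        history.drain(1..cut);
    }
}

/// Builds the per-request view: stored history plus a freshly generated
/// runtime prompt that is never written back to `history`.
fn request_messages(history: &[String], mode: &str, approval: &str) -> Vec<String> {
    let mut out = history.to_vec();
    out.push(format!(
        "<runtime_prompt>mode={mode} approval={approval}</runtime_prompt>"
    ));
    out
}

fn main() {
    let mut history: Vec<String> = vec![
        "BASE PROMPT".to_string(),
        "turn 1".to_string(),
        "turn 2".to_string(),
        "turn 3".to_string(),
        "turn 4".to_string(),
        "turn 5".to_string(),
    ];

    // Compaction discards the oldest turns but has no runtime prompt to lose.
    compact(&mut history, 2);
    println!("after compaction: {history:?}");

    // The next request regenerates the metadata from current state.
    let request = request_messages(&history, "agent", "on-request");
    println!("last message: {}", request.last().unwrap());
}
```

Under these assumptions the model always sees current mode and approval as the final message, however aggressively the stored history has been trimmed.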
Important Files Changed
refresh_system_prompt() drops mode param; new runtime_prompt_message() and helpers project mode+approval at request time via a transient user-role message; messages_with_turn_metadata() now appends the runtime prompt on every call.
compose_prompt parts array; stable prompt now contains only base + personality; functions made pub(crate); tests updated to assert mode/approval are absent from the base prompt.
mode: AppMode parameters removed from replan and replay functions; refresh_system_prompt() calls updated.
refresh_system_prompt() and checkpoint calls drop the mode argument; messages_with_turn_metadata() updated to append the runtime prompt.
_mode: AppMode parameter removed from should_default_defer_tool and apply_native_tool_deferral; mode still used in ensure_advanced_tooling for Plan-mode gating.
Op::SetModel updated; no functional change.
system_prompt_override updated; no functional change.
Reviews (12): Last reviewed commit: "feat(cache): project mode prompts per re..."