fix(memory): bound mate_memory_recall.filename to VARCHAR(256) (#461) #463
Merged
mateaix merged 1 commit on Jul 1, 2026
…ix#461)

mate_memory_recall.filename is VARCHAR(256), but the snippet-level recall tracker assembles the key as `path + '#' + H2-heading-slug`. When the LLM writes an over-long daily-note heading (the summarize prompt placed no length cap on the `##` title), the CJK-preserving slug pushes the filename past the column, and writes fail with Data too long / string too long.

Three layers of defence, root cause + hard caps:

1. prompt (source): summarize-system.txt now asks for short (≤30 chars) `##` titles; details go in the body, not the heading.
2. slug cap (close to source): MemoryRecallTracker.sanitizeSectionKey caps the slug at MAX_SECTION_SLUG=200, leaving path+'#' well under 256.
3. write-side cap (catches every path): MemoryRecallService.recordRecall truncates filename to MAX_FILENAME_LENGTH=255 at the entry point, so the select/insert/update branches share one value and the dup-key concurrency fallback still matches. Covers trackActiveRetrieval too, which bypasses sanitizeSectionKey.

Tests: MemoryRecallFilenameTruncationTest covers both caps (over-long CJK heading, normal heading untouched, ascii slug, date prefix survives) plus an end-to-end assertion that the stored value fits VARCHAR(256). Existing memory-suite unit tests still green.
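The slug cap and the write-side cap from the commit message can be sketched roughly as below. This is a minimal illustration, not the merged code: the constant and method names come from the commit message, but the slug rule (keep letters and digits, fold everything else to `-`), the `cap` helper, and the sample path `memory/2026-07-01.md` are assumptions made for the example.

```java
import java.util.Locale;

/** Illustrative only: names mirror the commit message, method bodies are assumptions. */
final class RecallKeyCaps {
    static final int MAX_SECTION_SLUG = 200;    // layer 2 cap, value from the commit message
    static final int MAX_FILENAME_LENGTH = 255; // layer 3 cap, value from the commit message

    /** Hypothetical slug rule: keep letters and digits (CJK included), fold the rest to '-'. */
    static String sanitizeSectionKey(String heading) {
        String slug = heading.trim().toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{N}]+", "-")
                .replaceAll("^-+|-+$", "");
        return cap(slug, MAX_SECTION_SLUG);
    }

    /** Write-side cap, applied once before any select/insert/update uses the value. */
    static String truncateFilename(String filename) {
        return cap(filename, MAX_FILENAME_LENGTH);
    }

    /** Hypothetical helper: plain length cap in UTF-16 code units. */
    private static String cap(String s, int max) {
        return s.length() <= max ? s : s.substring(0, max);
    }

    public static void main(String[] args) {
        String path = "memory/2026-07-01.md";        // hypothetical path
        String longHeading = "事件细节".repeat(100);  // 400-character heading
        String key = path + "#" + sanitizeSectionKey(longHeading);
        System.out.println("key length after slug cap: " + key.length());
        System.out.println("write-side cap: " + truncateFilename("x".repeat(300)).length());
    }
}
```

Note that `substring` counts UTF-16 code units, so a cut could in principle land inside a surrogate pair; whether the real implementation guards against that is not stated in the PR.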
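Owner

Merged, thank you 🙏 A textbook clean fix: the root cause is pinpointed accurately, and the three layers of defence (prompt constraint → slug truncated to 200 → write-entry truncation to 255) combine a root-cause fix with a hard backstop. The approach and the trade-offs are both well judged. Confirmed on review:

Merged as-is with no changes needed, and there is no follow-up this time. Thanks again for this high-quality fix 👏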
Fixes #461
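Root cause
mate_memory_recall.filename is VARCHAR(256), but snippet-level recall tracking builds the key as `file path + '#' + H2 heading slug` (MemoryRecallTracker.java:121). The prompt in summarize-system.txt put no length limit on `##` headings, so the LLM occasionally writes a whole paragraph of event detail as a heading. After sanitizeSectionKey() (which keeps CJK characters verbatim and does not truncate), the slug plus the file-path prefix exceeds 256 and the database write fails with Data too long / string too long. MemoryRecallService.recordRecall did not truncate before writing either. See #461 for the detailed analysis.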
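To make the overflow concrete, here is a small sketch of the length arithmetic. The class, the helper names, and the sample path and slugs are hypothetical. It also assumes the column limit is counted in characters, which the PR does not state explicitly.

```java
/** Illustrative only: shows how an uncapped key outgrows a 256-character column. */
final class RecallKeyOverflowDemo {
    static final int COLUMN_LIMIT = 256; // VARCHAR(256), from the PR

    /** Key layout described above: file path + '#' + heading slug. */
    static String buildKey(String path, String slug) {
        return path + "#" + slug;
    }

    /** Assumption: the column limit is counted in characters (code points). */
    static boolean fitsColumn(String key) {
        return key.codePointCount(0, key.length()) <= COLUMN_LIMIT;
    }

    public static void main(String[] args) {
        String path = "memory/2026-07-01.md";      // hypothetical path
        String shortSlug = "修复召回";               // a normal, short heading slug
        String longSlug = "事件细节".repeat(75);     // a 300-character slug with no cap applied

        String okKey = buildKey(path, shortSlug);
        String badKey = buildKey(path, longSlug);

        System.out.println(okKey.length() + " fits: " + fitsColumn(okKey));
        System.out.println(badKey.length() + " fits: " + fitsColumn(badKey));
    }
}
```

With an uncapped slug, the heading alone can exceed the column, so no choice of path length keeps the key within bounds.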
Changes (three layers of defence: root-cause fix + hard backstop)
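1. prompts/memory/summarize-system.txt: keep `##` headings short (≤30 characters); details go in the body, not the heading.
2. MemoryRecallTracker.java: the return value of sanitizeSectionKey is truncated to MAX_SECTION_SLUG=200, leaving ample headroom for the path (~25) + `#` (1).
3. MemoryRecallService.java: the recordRecall entry point truncates to MAX_FILENAME_LENGTH=255, covering every call path.

Why layers 2 and 3 are both needed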
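trackActiveRetrieval (MemoryRecallTracker.java:148) passes the filename supplied by the tool call straight through to recordRecall without going through sanitizeSectionKey, so it has to rely on layer 3 as the common backstop. recordRecall is the only write entry point for mate_memory_recall; truncating at the entry guarantees that the select / insert / update branches inside the method all use the same value. That avoids the mismatch of "old record not found → insert again → truncated again", and the concurrency fallback (the DuplicateKeyException path) matches correctly as well.

No schema change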
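VARCHAR(256) stays as it is. Every other reader in the system handles the `#` anchor by cutting it off and discarding it (`indexOf('#')` in MemoryRecallService.computeFreshness, `split("#",2)` in FactController), so truncating the slug does not affect date parsing or the recall/freshness calculations.

Verification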
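- MemoryRecallFilenameTruncationTest (plain JUnit 5, no Spring context, in the style of AlwaysOnFileBudgetTest): all 7 cases pass.
  - sanitizeSectionKey: an over-long Chinese heading yields a slug ≤200 without throwing; a normal Chinese heading (containing CJK) is not truncated by mistake; an ASCII heading is folded correctly.
  - truncateFilename: input over 255 is truncated to 255; input ≤255 is returned unchanged; the date prefix survives truncation (computeFreshness can still parse it).
- Existing memory-suite unit tests (AlwaysOnFileBudgetTest, MemorySummarizationGateTest, MemoryHilServiceTest, DreamFlagGuardTest, StructuredMemoryPrefetchTest, and others): all 35 pass, 0 failures, 0 errors.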
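The truncateFilename checks and the "date prefix survives" property can be sketched in plain Java, without JUnit so that the example stays self-contained. The file layout `memory/YYYY-MM-DD.md`, the `stripAnchor` and `parseDate` helpers, and the `check` function are hypothetical stand-ins. Only the 255 cap and the `indexOf('#')` anchor handling are taken from the PR text.

```java
import java.time.LocalDate;

/** Illustrative only: plain-Java version of the truncation checks listed above. */
final class TruncationChecks {
    static final int MAX_FILENAME_LENGTH = 255; // value from the PR

    static String truncateFilename(String filename) {
        return filename.length() <= MAX_FILENAME_LENGTH
                ? filename
                : filename.substring(0, MAX_FILENAME_LENGTH);
    }

    /** Drops the '#anchor' part, the way the PR says readers do via indexOf('#'). */
    static String stripAnchor(String filename) {
        int hash = filename.indexOf('#');
        return hash < 0 ? filename : filename.substring(0, hash);
    }

    /** Hypothetical layout: the file name starts with an ISO date, e.g. "2026-07-01.md". */
    static LocalDate parseDate(String filename) {
        String path = stripAnchor(filename);
        String base = path.substring(path.lastIndexOf('/') + 1);
        return LocalDate.parse(base.substring(0, 10));
    }

    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        String shortKey = "memory/2026-07-01.md#short-heading";
        String longKey = "memory/2026-07-01.md#" + "长".repeat(400);

        String stored = truncateFilename(longKey);
        check(stored.length() == MAX_FILENAME_LENGTH, "over-long key is cut to the cap");
        check(truncateFilename(shortKey).equals(shortKey), "short key is returned unchanged");
        check(parseDate(stored).equals(LocalDate.of(2026, 7, 1)), "date prefix survives");
        System.out.println("all checks passed");
    }
}
```

Because the cut happens at the tail of the string, only the anchor slug is shortened; the path and its date sit at the front and are unaffected as long as the path itself is shorter than the cap.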