Harden dedup/reconcile pipeline #26
Open
aayush3011 wants to merge 1 commit into
Pull request overview
This PR hardens the memory extraction/dedup/reconcile pipeline across sync, async, and Durable Function App paths by introducing a persisted extraction watermark, improving vector-distance awareness, and simplifying search to always attempt hybrid (vector + full-text) ranking with a safe fallback to vector-only.
Changes:
- Added a persisted per-thread extraction watermark (`last_extract_count`) to size `recent_k` and advance only after a successful extract→persist, preventing stranded turns after transient failures.
- Implemented/validated a vector-dedup “ladder” and candidate-mode reconcile behavior, including distance-function awareness (cosine/dotproduct vs euclidean) and a persisted-counter cadence for periodic full-pool backstops.
- Removed the `hybrid_search` flag and switched search to automatic keyword extraction with a Cosmos `FullTextScore` term cap and vector-only fallback for all-stopword queries.
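The watermark behavior described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `CounterDoc`, `turn_count`, `compute_recent_k`, and `run_extract` are hypothetical names, and only `last_extract_count` and `recent_k` come from the PR text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CounterDoc:
    """Hypothetical stand-in for the per-thread counter document."""
    turn_count: int = 0          # assumed field: total turns seen so far
    last_extract_count: int = 0  # watermark named in the PR


def compute_recent_k(doc: CounterDoc) -> int:
    """Turns added since the last successful extract (never negative)."""
    return max(doc.turn_count - doc.last_extract_count, 0)


def run_extract(doc: CounterDoc, extract_and_persist: Callable[[int], object]) -> bool:
    """Run extract->persist over the unprocessed window.

    The watermark moves only when the callable returns without raising,
    so a transient failure leaves the same turns in the next window.
    """
    recent_k = compute_recent_k(doc)
    if recent_k == 0:
        return False
    target = doc.turn_count  # snapshot before the (possibly slow) call
    try:
        extract_and_persist(recent_k)
    except Exception:
        return False  # watermark untouched; turns are retried next time
    doc.last_extract_count = max(doc.last_extract_count, target)
    return True
```

With a fixed window, turns that arrive during a failed run can fall outside the next window; sizing the window from the watermark means the next run simply covers a larger `recent_k`.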
Reviewed changes
Copilot reviewed 63 out of 63 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/unit/test_utils.py | Adds unit coverage for keyword extraction, vector-distance helpers, and container policy distanceFunction reads. |
| tests/unit/test_thresholds.py | Adds coverage for env-backed threshold getters and internalized (non-env) dedup/search constants. |
| tests/unit/test_reconcile.py | Pins legacy reconcile paths for existing tests; removes extract-time UPDATE tests. |
| tests/unit/test_process_now.py | Updates expectations for fact+episodic reconcile being invoked. |
| tests/unit/test_procedural_synthesis.py | Pins legacy extract dedup knobs for stability. |
| tests/unit/test_pipeline_confidence.py | Pins legacy extract dedup knobs for stability. |
| tests/unit/test_cosmos_memory_client.py | Updates constructor/serverless autoscale behavior and hybrid search SQL expectations; adds forwarding tests. |
| tests/unit/test_auto_trigger.py | Adds watermark-driven recent_k tests and persisted-counter full-rebuild cadence tests. |
| tests/unit/store/test_memory_store.py | Verifies hybrid SQL uses @kwN params and stopword fallback to vector-only. |
| tests/unit/services/test_pipeline_service.py | Updates extract behavior to “add-only” facts; pins legacy dedup/reconcile mode for tests. |
| tests/unit/services/test_extract_dry.py | Adds dry-extract stage-1 search behavior tests and async mirrors. |
| tests/unit/services/test_dedup_vector.py | Adds extensive sync unit coverage for vector dedup ladder + candidate-mode reconcile. |
| tests/unit/services/test_chaos_extract_persist.py | Pins extract dedup knobs across sync/async for chaos tests. |
| tests/unit/processors/test_protocol_satisfaction.py | Updates processor protocol to accept recent_k. |
| tests/unit/processors/test_inprocess.py | Updates in-process processor behavior for fact+episodic reconcile and recent_k plumbing. |
| tests/unit/function_app/test_orchestrators.py | Updates orchestration chain to Extract→Dedup→Persist and reconciles fact+episodic; adds watermark advance activity tests. |
| tests/unit/function_app/test_change_feed.py | Adds watermark-based recent_k assertions and persisted-counter full_rebuild cadence tests. |
| tests/unit/aio/test_reconcile_telemetry.py | Pins legacy async reconcile mode for telemetry tests. |
| tests/unit/aio/test_process_now.py | Updates async process_now expectations for fact+episodic reconcile being awaited. |
| tests/unit/aio/test_cosmos_memory_client.py | Updates async hybrid search SQL expectations and forwarding tests; serverless autoscale ignore behavior. |
| tests/unit/aio/test_auto_trigger.py | Adds async watermark recent_k tests and async full-rebuild cadence tests. |
| tests/unit/aio/services/test_dedup_vector_async.py | Adds extensive async unit coverage for vector dedup ladder + candidate-mode reconcile. |
| tests/unit/aio/processors/test_protocol_satisfaction.py | Updates async processor protocol to accept recent_k. |
| tests/unit/aio/processors/test_inprocess.py | Updates async in-process processor behavior for fact+episodic reconcile and recent_k plumbing. |
| tests/integration/test_processor_integration.py | Updates sync integration to expect fact+episodic reconcile calls. |
| tests/integration/test_processor_integration_async.py | Updates async integration to expect fact+episodic reconcile calls. |
| tests/integration/test_full_pipeline.py | Removes hybrid_search flag usage; adds live integration for extract-time vector dedup. |
| tests/integration/test_async_full_pipeline.py | Adds new async live integration smoke test mirroring sync behavior. |
| Samples/Notebooks/Demo_async.ipynb | Removes hybrid_search flag usage in the async notebook demo. |
| Samples/Advanced/advanced_search_patterns.py | Updates narrative to reflect hybrid-by-default search; removes flag usage. |
| function_app/triggers/change_feed.py | Computes recent_k from persisted watermark and sets persisted-counter full-rebuild cadence. |
| function_app/shared/counters.py | Preserves last_extract_count; adds read/advance watermark helpers. |
| function_app/shared/config.py | Removes unused float parsing helper. |
| function_app/orchestrators/extract_memories.py | Inserts Dedup activity; forwards recent_k and full_rebuild; advances watermark post-persist. |
| function_app/local.settings.json.template | Adds DEDUP_EVERY_N to the template. |
| Docs/troubleshooting.md | Updates configuration guidance (no hybrid flag; throughput guidance via client args). |
| Docs/public_api.md | Updates public API docs to remove hybrid_search argument and describe fallback behavior. |
| Docs/design_patterns.md | Removes hybrid_search flag from examples. |
| Docs/concepts.md | Documents watermarking, vector-floor ladder, dual-mode reconcile, and hybrid-search behavior. |
| azure/cosmos/agent_memory/thresholds.py | Internalizes several dedup/search knobs as fixed constants with accessor functions. |
| azure/cosmos/agent_memory/store/memory_store.py | Switches to keyword extraction and hybrid SQL driven by extracted terms. |
| azure/cosmos/agent_memory/store/_search_helpers.py | Builds hybrid SQL using per-keyword @kwN parameters with vector-only fallback. |
| azure/cosmos/agent_memory/services/_pipeline_helpers.py | Improves LLM JSON parse errors with truncation heuristics and clearer guidance. |
| azure/cosmos/agent_memory/prompts/extract_memories.prompty | Removes extract-time UPDATE/CONTRADICT schema and increases maxOutputTokens; clarifies speaker discrimination. |
| azure/cosmos/agent_memory/prompts/dedup.prompty | Increases maxOutputTokens. |
| azure/cosmos/agent_memory/prompts/dedup_episodic.prompty | Adds new episodic merge-only reconcile prompt. |
| azure/cosmos/agent_memory/prompts/_schemas.py | Adds episodic dedup schema; removes action/supersedes fields from extraction schema. |
| azure/cosmos/agent_memory/processors/inprocess.py | Adds recent_k plumbing and fact+episodic reconcile routing with optional full_rebuild. |
| azure/cosmos/agent_memory/processors/durable.py | Extends protocol to accept recent_k and full_rebuild (no-op). |
| azure/cosmos/agent_memory/processors/base.py | Extends processor protocol with recent_k and full_rebuild. |
| azure/cosmos/agent_memory/cosmos_memory_client.py | Removes hybrid_search plumbing; forwards episodic search options; reconcile uses full rebuild. |
| azure/cosmos/agent_memory/auto_trigger.py | Uses persisted watermark for recent_k; persisted-counter full-rebuild cadence; advances watermark on success only. |
| azure/cosmos/agent_memory/aio/store/memory_store.py | Async mirror of keyword-extraction hybrid search behavior. |
| azure/cosmos/agent_memory/aio/processors/inprocess.py | Async mirror of processor changes for recent_k and fact+episodic reconcile routing. |
| azure/cosmos/agent_memory/aio/processors/durable.py | Async protocol extension (no-op). |
| azure/cosmos/agent_memory/aio/processors/base.py | Async protocol extension. |
| azure/cosmos/agent_memory/aio/cosmos_memory_client.py | Async mirror of client search/reconcile changes. |
| azure/cosmos/agent_memory/aio/auto_trigger.py | Async mirror of watermark recent_k and persisted-counter full-rebuild cadence. |
| azure/cosmos/agent_memory/_utils.py | Adds keyword extraction (stopwords + 30-term cap) and distance-function utilities. |
| azure/cosmos/agent_memory/_counters.py | Adds read/advance extract watermark helpers; preserves watermark across updates. |
| .env.template | Removes throughput/embedding-distance env knobs now expected to be passed explicitly as client args. |
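As a rough illustration of the search change summarized for `_utils.py` and `_search_helpers.py`, the sketch below extracts keywords with a stopword filter and a 30-term cap, binds each keyword to its own `@kwN` parameter, and falls back to vector-only ordering when no keywords survive. The stopword list, the `c.embedding` and `c.content` property paths, and both function names are assumptions made for the example; the exact SQL the PR emits is not shown on this page.

```python
import re

MAX_TERMS = 30  # term cap mentioned in the file summary
# Illustrative stopword list only; the real list is not shown in the PR page.
STOPWORDS = frozenset({"a", "an", "and", "are", "is", "of", "the", "to", "what", "who"})


def extract_keywords(query: str, max_terms: int = MAX_TERMS) -> list[str]:
    """Lowercase, drop stopwords and duplicates, keep at most max_terms terms."""
    keywords: list[str] = []
    for token in re.findall(r"[a-z0-9]+", query.lower()):
        if token in STOPWORDS or token in keywords:
            continue
        keywords.append(token)
        if len(keywords) >= max_terms:
            break
    return keywords


def build_search_query(query: str, embedding: list[float], top_k: int = 10):
    """Return (sql, parameters) for a hybrid query, or vector-only if no keywords."""
    params = [{"name": "@embedding", "value": embedding}]
    vector = "VectorDistance(c.embedding, @embedding)"  # assumed property path
    keywords = extract_keywords(query)
    if not keywords:
        # All-stopword (or empty) query: nothing for full-text scoring to use.
        return f"SELECT TOP {int(top_k)} * FROM c ORDER BY {vector}", params
    names = [f"@kw{i}" for i in range(len(keywords))]
    params += [{"name": n, "value": k} for n, k in zip(names, keywords)]
    full_text = f"FullTextScore(c.content, {', '.join(names)})"  # assumed property path
    sql = f"SELECT TOP {int(top_k)} * FROM c ORDER BY RANK RRF({vector}, {full_text})"
    return sql, params
```

Binding keywords as parameters rather than interpolating them keeps user text out of the SQL string, and the cap bounds the number of `FullTextScore` arguments regardless of query length.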
Comment on lines 162 to 165
### Tunable

- `DEDUP_EVERY_N` (default 5) controls how often `reconcile_memories` runs in the auto-trigger path. Set to `0` to disable. The candidate cap `n` (default 50) is tunable per call; larger values give the LLM a wider view at higher token cost.
+ `DEDUP_EVERY_N` (default 5) controls how often reconcile runs in the auto-trigger path. Set to `0` to disable. The candidate cap `n` (default `DEDUP_POOL_SIZE`, 50) is tunable per call; larger values give the LLM a wider view at higher token cost. `DEDUP_FULL_RECLUSTER_EVERY_N` (default 12) sets how often the full-pool backstop fires.
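One way to read the two cadence knobs is sketched below. The function name `plan_reconcile` and the `count` argument (a persisted per-thread counter) are hypothetical, and the sketch assumes `DEDUP_FULL_RECLUSTER_EVERY_N` counts reconcile runs rather than turns, which the doc text does not state.

```python
def plan_reconcile(count: int, every_n: int = 5, full_every_n: int = 12) -> tuple[bool, bool]:
    """Return (run_reconcile, full_rebuild) for a persisted counter value.

    every_n mirrors DEDUP_EVERY_N (0 disables reconcile entirely).
    full_every_n mirrors DEDUP_FULL_RECLUSTER_EVERY_N; treating it as
    "every Nth reconcile run" is an assumption made for this sketch.
    """
    if every_n <= 0 or count <= 0 or count % every_n != 0:
        return (False, False)
    reconcile_ordinal = count // every_n  # 1 for the first reconcile, 2 for the second, ...
    full_rebuild = full_every_n > 0 and reconcile_ordinal % full_every_n == 0
    return (True, full_rebuild)
```

Because the decision depends only on the persisted counter, the same cadence holds across process restarts, which an in-memory counter would not guarantee.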
Summary
Hardens the memory dedup/reconcile pipeline and simplifies the search surface. Sync, async, and durable (Function App) paths are kept in lockstep.
Changes
- Sizes `recent_k` from a persisted per-thread watermark (`last_extract_count` on the counter doc) instead of a fixed window. The watermark advances only after a successful extract, so transient extract failures no longer strand turns.
- … `sys:dup-candidate` for the LLM reconcile. Stale tags on seeds that never cluster are cleared.
- … `distanceFunction` and disables the cosine-calibrated auto-drop for euclidean.
- Removes the `hybrid_search` flag; every `search_cosmos` call now fuses vector + BM25 automatically (keyword extraction with graceful fallback to pure vector).
Testing