Add agentic_search MCP tool backed by the Cosmos Retriever service by aryan-410 · Pull Request #135 · AzureCosmosDB/MCPToolKit

aryan-410 · 2026-06-26T22:15:07Z

Summary

Adds a 9th MCP tool, agentic_search, that runs a multi-turn retrieval agent over a Cosmos DB corpus and returns ranked, curated documents (hybrid vector + full-text RRF search, optional rerank, multi-turn read/prune).
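
For readers unfamiliar with RRF, the fusion step named above can be sketched as follows. This is only an illustration of the Reciprocal Rank Fusion formula, not code from this PR: in practice the fusion is expected to happen inside the Cosmos DB query, and the function name `rrf_fuse`, the constant `k=60`, and the sample document IDs are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used constant (assumption).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector search and a full-text search.
vector_hits = ["d3", "d1", "d2"]
text_hits = ["d1", "d4", "d3"]
fused = rrf_fuse([vector_hits, text_hits])
```

Documents that appear near the top of both lists (here `d1` and `d3`) outrank documents that appear in only one.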

Commits

Vendor the Cosmos Retriever Python service — FastAPI service (POST /search, GET /health) wrapping CosmosRetriever. Pluggable inference backend: harmony_vllm (fine-tuned pat-jj/harness-1), openai_chat (any OpenAI-compatible chat model), or openai_responses (reasoning models such as gpt-5.4 on Azure AI Foundry). Includes tests for the server and agent loops.
Add agentic_search MCP tool (.NET) — AgenticSearchExecutor calls the service over HTTP (COSMOS_RETRIEVER_URL, COSMOS_RETRIEVER_TIMEOUT_S) and always returns parseable JSON (error envelope on failure). Wired into Program.cs, MCPProtocolController (tools/list + tools/call), MCPTestController, and McpToolRequestValidator. CosmosClientFactory excludes ManagedIdentityCredential (falls through to az login) and accepts the standard MCP _meta params field. Docs: docs/AGENTIC_SEARCH.md, README + CHANGELOG + .env.example.
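
The "always returns parseable JSON" contract can be sketched as below. The actual executor is written in .NET; this is a Python analogue of the same pattern, not the PR's code. Only the two environment variable names and the `POST /search` route come from the description above. The request fields (`query`), the default URL and timeout, the function name, and the shape of the error envelope are hypothetical.

```python
import http.client
import json
import os
import urllib.request


def agentic_search_sketch(query: str) -> str:
    """Call the retriever service and always return a parseable JSON string."""
    try:
        # Defaults are assumptions; the PR only names the variables.
        base_url = os.environ.get("COSMOS_RETRIEVER_URL", "http://localhost:8000")
        timeout_s = float(os.environ.get("COSMOS_RETRIEVER_TIMEOUT_S", "120"))

        request = urllib.request.Request(
            base_url.rstrip("/") + "/search",
            data=json.dumps({"query": query}).encode("utf-8"),  # hypothetical body
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            body = response.read().decode("utf-8")
        json.loads(body)  # reject non-JSON responses instead of passing them on
        return body
    except (OSError, ValueError, http.client.HTTPException) as exc:
        # OSError covers URLError, HTTPError, and timeouts; ValueError covers
        # bad config values and invalid JSON. Hypothetical envelope shape.
        return json.dumps(
            {"ok": False, "error": {"type": type(exc).__name__, "message": str(exc)}}
        )
```

Because every failure path is folded into the envelope, a caller can run `json.loads` on the result unconditionally and branch on the presence of an error field.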

Testing

.NET: dotnet build clean; executor unit tests pass.
Python: retriever test suites pass.
End-to-end verified through the MCP /mcp/http tools/call path against a live Cosmos corpus with gpt-5.4 (Azure AI Foundry) via the responses backend.

Note: the Cosmos Retriever Python service is vendored here so the tool is self-contained; happy to split it into a separate repo/submodule if maintainers prefer.

FastAPI service (POST /search, GET /health) wrapping CosmosRetriever, which runs a multi-turn retrieval agent over a Cosmos DB corpus. Pluggable inference backend: harmony_vllm (fine-tuned pat-jj/harness-1), openai_chat (any OpenAI-compatible chat model), or openai_responses (reasoning models such as gpt-5.4 on Azure AI Foundry). Includes tests for the server and agent loops. The .NET agentic_search tool calls this service over HTTP.

Add a 9th MCP tool, agentic_search, that runs the Cosmos Retriever agent over a Cosmos DB corpus and returns ranked, curated documents.

- AgenticSearchExecutor: calls the cosmos-retriever service over HTTP (COSMOS_RETRIEVER_URL, COSMOS_RETRIEVER_TIMEOUT_S); always returns parseable JSON (error envelope on failure).
- Wire into Program.cs, MCPProtocolController (tools/list + tools/call), MCPTestController, and McpToolRequestValidator.
- CosmosClientFactory: exclude ManagedIdentityCredential (fall through to az login); accept the standard MCP _meta params field.
- Docs: docs/AGENTIC_SEARCH.md, README + CHANGELOG + .env.example.

aryan-410 · 2026-06-30T04:42:24Z

+        self.corpus: CorpusConfig = self.settings.resolve_corpus(corpus_name)
+
+        self._enc: HarmonyEncoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
+        self._tiktoken = tiktoken.get_encoding("o200k_harmony")


The encoding name should be configurable, in case alternatives are needed.
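
A minimal sketch of what "configurable" could look like here, assuming the name is read from an environment variable with the current value as the default. The variable name `RETRIEVER_TIKTOKEN_ENCODING` and the `TokenizerSettings` class are hypothetical; the PR's own settings object may expose this differently.

```python
import os
from dataclasses import dataclass, field

DEFAULT_TIKTOKEN_ENCODING = "o200k_harmony"  # the value currently hard-coded


def _encoding_from_env() -> str:
    # Hypothetical variable name; falls back to the existing default.
    return os.environ.get("RETRIEVER_TIKTOKEN_ENCODING", DEFAULT_TIKTOKEN_ENCODING)


@dataclass(frozen=True)
class TokenizerSettings:
    tiktoken_encoding: str = field(default_factory=_encoding_from_env)


# In the retriever, the hard-coded call would then become something like:
#     self._tiktoken = tiktoken.get_encoding(settings.tiktoken_encoding)
```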

…rness

- Move VllmTokenCompleter + run_single_episode into inference/vllm_policy.py
- Delete inference/evaluate_harness1_vllm.py (eval/benchmark code)
- Repoint retriever.py to vllm_policy; update env_rl docstring
- Include pre-existing: pool_doc_ids trajectory pooling (openai_chat), optional baseten import (rerank)

…s to 1-50, de-brand harness references

The datagen/ package deletion was previously only staged, never committed, so it still appeared in the PR. Actually remove it (search_dataset.py, generate_sft_rl_splits.py, BrowseComp-Plus, README, __init__) along with the unit tests folder, the datagen TYPE_CHECKING import in tasks.py, the stale datagen comment in config.py, and the now-dangling pytest/respx dev deps and pytest/ruff test config in pyproject.toml.

…lResult Add a trajectory field to RetrievalResult populated by the harmony_vllm backend: the search queries issued (search_history), per-turn tool calls (turn_tools), programmatic per-turn status summaries (turn_summaries), and the final per-doc importance tags from curation (curated_importance).
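
The trajectory field described in this commit might be shaped roughly as follows. Only the four field names and `RetrievalResult` come from the commit message; the `Trajectory` class name, the element types, the `doc_ids` field, and the sample values are assumptions for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class Trajectory:
    search_history: List[str] = field(default_factory=list)    # queries issued
    turn_tools: List[List[str]] = field(default_factory=list)  # tool calls per turn
    turn_summaries: List[str] = field(default_factory=list)    # status per turn
    curated_importance: Dict[str, str] = field(default_factory=dict)  # doc ID -> tag


@dataclass
class RetrievalResult:
    doc_ids: List[str]                       # hypothetical existing field
    trajectory: Optional[Trajectory] = None  # None for backends that do not populate it


# Hypothetical two-turn episode.
result = RetrievalResult(
    doc_ids=["d1", "d3"],
    trajectory=Trajectory(
        search_history=["cosmos ttl", "cosmos ttl per item"],
        turn_tools=[["search"], ["search", "read"]],
        turn_summaries=["turn 1: 5 hits", "turn 2: kept 2 docs"],
        curated_importance={"d1": "high", "d3": "medium"},
    ),
)
payload = asdict(result)  # nested dataclasses become plain dicts for JSON output
```

Keeping the field optional lets backends that do not record a trajectory return the same result type unchanged.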

cosmos-dev added 2 commits June 26, 2026 21:57

aryan-410 added 6 commits June 30, 2026 05:01

chore: remove benchmark scripts and dev artifacts from toolkit

9db5640

chore: remove generated datagen/splits folder

905acc0

chore(toolkit): polish agentic_search descriptions, align maxDocument…

03aa2e0

…s to 1-50, de-brand harness references

aryan-410 wants to merge 8 commits into AzureCosmosDB:main from aryan-410:feat/agentic-search-cosmos-retriever
