Add agentic_search MCP tool backed by the Cosmos Retriever service#135
Open
aryan-410 wants to merge 8 commits into
added 2 commits
June 26, 2026 21:57
FastAPI service (POST /search, GET /health) wrapping CosmosRetriever, which runs a multi-turn retrieval agent over a Cosmos DB corpus. Pluggable inference backend: harmony_vllm (fine-tuned pat-jj/harness-1), openai_chat (any OpenAI-compatible chat model), or openai_responses (reasoning models such as gpt-5.4 on Azure AI Foundry). Includes tests for the server and agent loops. The .NET agentic_search tool calls this service over HTTP.
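One way to read "pluggable inference backend" is a name-to-factory registry keyed by the three backend names listed above. The sketch below is only an illustration under that assumption: `InferenceBackend`, `_EchoBackend`, `BACKENDS`, and `make_backend` are hypothetical names, and the stand-in backend does no real inference.

```python
from typing import Callable, Dict, Protocol


class InferenceBackend(Protocol):
    """Hypothetical interface; the PR does not show the real one."""

    name: str

    def complete(self, prompt: str) -> str: ...


class _EchoBackend:
    """Stand-in that only tags the prompt, so the sketch runs without a model."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


# Backend names come from the commit message; the factories are placeholders.
BACKENDS: Dict[str, Callable[[], InferenceBackend]] = {
    "harmony_vllm": lambda: _EchoBackend("harmony_vllm"),
    "openai_chat": lambda: _EchoBackend("openai_chat"),
    "openai_responses": lambda: _EchoBackend("openai_responses"),
}


def make_backend(name: str) -> InferenceBackend:
    """Look up a backend by name and fail loudly on an unknown one."""
    try:
        factory = BACKENDS[name]
    except KeyError:
        known = ", ".join(sorted(BACKENDS))
        raise ValueError(f"unknown backend {name!r}; expected one of: {known}") from None
    return factory()
```

With a registry like this, adding a backend would mean adding one entry, and a typo in configuration surfaces as a clear error at startup rather than midway through a search.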
Add a 9th MCP tool, agentic_search, that runs the Cosmos Retriever agent over a Cosmos DB corpus and returns ranked, curated documents.
- AgenticSearchExecutor: calls the cosmos-retriever service over HTTP (COSMOS_RETRIEVER_URL, COSMOS_RETRIEVER_TIMEOUT_S); always returns parseable JSON (error envelope on failure).
- Wire into Program.cs, MCPProtocolController (tools/list + tools/call), MCPTestController, and McpToolRequestValidator.
- CosmosClientFactory: exclude ManagedIdentityCredential (fall through to az login); accept the standard MCP _meta params field.
- Docs: docs/AGENTIC_SEARCH.md, README + CHANGELOG + .env.example.
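The "always returns parseable JSON" contract can be illustrated in Python, although the actual executor is .NET. This is a rough sketch, not the PR's implementation: the request body field `query`, the envelope shape (`success`, `error.type`, `error.message`), the default URL, and the default timeout are all assumptions. Only the two environment variable names and the `/search` route come from the PR.

```python
import json
import os
import urllib.request
from typing import Callable, Mapping, Optional


def _http_post_json(url: str, payload: dict, timeout_s: float) -> str:
    """POST a JSON body and return the raw response text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.read().decode("utf-8")


def agentic_search(
    query: str,
    post: Callable[[str, dict, float], str] = _http_post_json,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Call the retriever service; on any failure return a JSON error envelope."""
    env = os.environ if env is None else env
    try:
        # Defaults below are assumed for illustration only.
        base_url = env.get("COSMOS_RETRIEVER_URL", "http://localhost:8000")
        timeout_s = float(env.get("COSMOS_RETRIEVER_TIMEOUT_S", "60"))
        raw = post(base_url.rstrip("/") + "/search", {"query": query}, timeout_s)
        json.loads(raw)  # reject bodies that are not valid JSON
        return raw
    except Exception as exc:  # deliberately broad: the caller must always get JSON
        return json.dumps(
            {
                "success": False,
                "error": {"type": type(exc).__name__, "message": str(exc)},
            }
        )
```

Passing the transport in as `post` keeps the failure paths (timeout, refused connection, non-JSON body, bad timeout value) testable without a running service.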
aryan-410 commented Jun 30, 2026
self.corpus: CorpusConfig = self.settings.resolve_corpus(corpus_name)

self._enc: HarmonyEncoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
self._tiktoken = tiktoken.get_encoding("o200k_harmony")
The encoding name should be configurable, in case alternative encodings are needed.
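A minimal sketch of what that could look like, assuming the name is read from an environment variable with `o200k_harmony` as the default. `TokenizerSettings`, `load_tokenizer`, and the variable `COSMOS_RETRIEVER_TIKTOKEN_ENCODING` are hypothetical and do not exist in the PR; `tiktoken.get_encoding` is the same call the reviewed line already uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_TIKTOKEN_ENCODING = "o200k_harmony"  # the value hard-coded in the reviewed line
ENV_VAR = "COSMOS_RETRIEVER_TIKTOKEN_ENCODING"  # hypothetical variable name


@dataclass(frozen=True)
class TokenizerSettings:
    """Hypothetical settings holder for the tokenizer encoding name."""

    tiktoken_encoding: str = DEFAULT_TIKTOKEN_ENCODING

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "TokenizerSettings":
        env = os.environ if env is None else env
        name = env.get(ENV_VAR, "").strip()
        # An unset or blank variable falls back to the current default.
        return cls(tiktoken_encoding=name or DEFAULT_TIKTOKEN_ENCODING)


def load_tokenizer(settings: TokenizerSettings):
    """Resolve the configured encoding; tiktoken is imported lazily."""
    import tiktoken  # third-party dependency the service already uses

    return tiktoken.get_encoding(settings.tiktoken_encoding)
```

The constructor line would then become something like `self._tiktoken = load_tokenizer(settings)`, with behaviour unchanged unless the variable is set.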
…rness
- Move VllmTokenCompleter + run_single_episode into inference/vllm_policy.py
- Delete inference/evaluate_harness1_vllm.py (eval/benchmark code)
- Repoint retriever.py to vllm_policy; update env_rl docstring
- Include pre-existing: pool_doc_ids trajectory pooling (openai_chat), optional baseten import (rerank)
…s to 1-50, de-brand harness references
The datagen/ package deletion was previously only staged, never committed, so it still appeared in the PR. Actually remove it (search_dataset.py, generate_sft_rl_splits.py, BrowseComp-Plus, README, __init__) along with the unit tests folder, the datagen TYPE_CHECKING import in tasks.py, the stale datagen comment in config.py, and the now-dangling pytest/respx dev deps and pytest/ruff test config in pyproject.toml.
…lResult Add a trajectory field to RetrievalResult populated by the harmony_vllm backend: the search queries issued (search_history), per-turn tool calls (turn_tools), programmatic per-turn status summaries (turn_summaries), and the final per-doc importance tags from curation (curated_importance).
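The four trajectory fields named above could be modelled roughly as follows. This is a sketch under assumptions: the field names come from the commit message, but the element types, the `Trajectory` class name, and the `documents` field on `RetrievalResult` are guesses made for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class Trajectory:
    """Per-run trace; element types are assumed, not taken from the PR."""

    search_history: List[str] = field(default_factory=list)  # search queries issued
    turn_tools: List[List[str]] = field(default_factory=list)  # tool calls per turn
    turn_summaries: List[str] = field(default_factory=list)  # status summary per turn
    curated_importance: Dict[str, str] = field(default_factory=dict)  # doc id -> tag


@dataclass
class RetrievalResult:
    """Hypothetical shape; only the trajectory field is described in the PR."""

    documents: List[dict] = field(default_factory=list)
    # None for backends that do not populate a trajectory.
    trajectory: Optional[Trajectory] = None


result = RetrievalResult(
    documents=[{"id": "doc-1"}],
    trajectory=Trajectory(
        search_history=["example query"],
        turn_tools=[["search"]],
        turn_summaries=["turn 1: 1 search call"],
        curated_importance={"doc-1": "high"},
    ),
)
payload = asdict(result)  # nested dataclasses become plain dicts, ready for JSON
```

Keeping `trajectory` optional reflects the commit's statement that it is populated by the harmony_vllm backend.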
Summary
Adds a 9th MCP tool, agentic_search, that runs a multi-turn retrieval agent over a Cosmos DB corpus and returns ranked, curated documents (hybrid vector + full-text RRF search, optional rerank, multi-turn read/prune).
Commits
- FastAPI service (POST /search, GET /health) wrapping CosmosRetriever. Pluggable inference backend: harmony_vllm (fine-tuned pat-jj/harness-1), openai_chat (any OpenAI-compatible chat model), or openai_responses (reasoning models such as gpt-5.4 on Azure AI Foundry). Includes tests for the server and agent loops.
- agentic_search MCP tool (.NET): AgenticSearchExecutor calls the service over HTTP (COSMOS_RETRIEVER_URL, COSMOS_RETRIEVER_TIMEOUT_S) and always returns parseable JSON (error envelope on failure). Wired into Program.cs, MCPProtocolController (tools/list + tools/call), MCPTestController, and McpToolRequestValidator. CosmosClientFactory excludes ManagedIdentityCredential (falls through to az login) and accepts the standard MCP _meta params field. Docs: docs/AGENTIC_SEARCH.md, README + CHANGELOG + .env.example.
Testing
- dotnet build clean; executor unit tests pass.
- Exercised the /mcp/http tools/call path against a live Cosmos corpus with gpt-5.4 (Azure AI Foundry) via the responses backend.
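For readers unfamiliar with the RRF step in the hybrid search mentioned in the Summary, Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over the ranked lists it appears in. The sketch below is a plain-Python illustration of that scoring rule only. It is not how the service or Cosmos DB computes the fusion, and `rrf_fuse` and the constant `k = 60` (a commonly used value) are assumptions rather than values taken from the PR.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of doc ids with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]  # hypothetical vector-search ranking
text_hits = ["b", "d", "a"]  # hypothetical full-text ranking
fused = rrf_fuse([vector_hits, text_hits])
```

Documents that rank well in both lists ("b" and "a" here) rise above documents that appear in only one, which is the property that makes RRF a convenient way to combine vector and full-text results without normalising their raw scores.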