Add FusionGateRouter — a route-vs-fuse meta-router (openrouter:fusion, zero core edits)#180
Open
ConsultingFuture4200 wants to merge 1 commit into
- Gate each query between single-model routing and OpenRouter openrouter:fusion (panel + judge), with a three-tier dial: single / budget_fusion / fusion
- Isolate the beta openrouter:fusion server tool behind FusionExecutor (one blast point); graceful judge-failure fallback; per-query dollar cost_ceiling
- Capability-scored panel selection with Quality/Budget preset fallback
- --route-only spend-free preview; 6+ config keys; secret-scrubbed fusion logging producing FusionFactory-style training rows; offline retrain step
- Three-arm offline eval harness + fixtures (mock = zero spend); 42 tests
- Zero core edits; one optional provider; local fan-out fallback left as follow-up
Add FusionGateRouter — a route-vs-fuse meta-router
Summary
Adds `FusionGateRouter`, a self-contained custom router plugin under `custom_routers/fusion_gate/` that gates each query between the cheap single-model path and a multi-model fusion path, with fusion delegated to OpenRouter's `openrouter:fusion` server tool. Zero edits to core `llmrouter/` code: the plugin is auto-discovered via the existing `custom_routers/` mechanism, exactly like `randomrouter` and `thresholdrouter`.
Motivation
LLMRouter today picks which single model answers a query. The interesting
lever for hard queries is a different one: route vs. fuse — decide whether a
query is worth running a panel of models and synthesizing their answers. This PR
makes route-vs-fuse the primary per-query dial, expressed as a three-tier
escalation driven by estimated difficulty: `single` → `budget_fusion` → `fusion`.
Cheap queries stay cheap; only the hard ones escalate, and the middle tier lets
mid-difficulty queries fuse on a budget panel instead of jumping straight to the
full Quality panel.
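The three-tier escalation can be sketched as a pair of small functions. This is a minimal illustration, not the plugin's code: the names `choose_tier` and `apply_cost_ceiling` and the two cut points are hypothetical, the gate here looks at difficulty only (the real gate also uses confidence), and downgrading `budget_fusion` as well as `fusion` under the cost ceiling is an assumption.

```python
from typing import Optional

# Illustrative cut points only (assumption); the plugin's real thresholds
# come from its config and are not reproduced here.
BUDGET_CUT = 0.4
FUSION_CUT = 0.7


def choose_tier(difficulty: float,
                budget_cut: float = BUDGET_CUT,
                fusion_cut: float = FUSION_CUT) -> str:
    """Map an estimated difficulty in [0, 1] to one of the three tiers."""
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be in [0, 1]")
    if difficulty >= fusion_cut:
        return "fusion"          # hard query: full Quality panel
    if difficulty >= budget_cut:
        return "budget_fusion"   # mid-difficulty: budget panel
    return "single"              # cheap query stays on one model


def apply_cost_ceiling(tier: str,
                       projected_cost: float,
                       cost_ceiling: Optional[float]) -> str:
    """Downgrade a fusion tier to single when projected spend exceeds the cap."""
    over_cap = cost_ceiling is not None and projected_cost > cost_ceiling
    if tier != "single" and over_cap:
        return "single"
    return tier
```

With these illustrative cut points, `choose_tier(0.5)` lands on the middle tier, and `apply_cost_ceiling("fusion", 0.05, 0.02)` falls back to `single` because the projected spend is above the cap.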
What's included
In scope:
- `FusionGateRouter`: the route-vs-fuse gate (difficulty + confidence) plus capability-scored panel selection with a Quality/Budget preset fallback.
- `openrouter:fusion` adapter (`executor.py`): the single, isolated blast point for the beta server-tool API.
- Config keys (`threshold`, `k`, `judge`, `provider`/`base_url`, `panel_preset`, `cost_ceiling`, `est_completion_tokens`) and a `--route-only` spend-free preview that returns the decision + intended panel/judge without any API call.
- Per-query dollar cost ceiling (`cost_ceiling`) that downgrades fusion → single when the projected spend exceeds the cap.
- Secret-scrubbed fusion logging (`fusion_log.py`) producing FusionFactory-style `(query, model, response, performance)` training rows.
- Three-arm offline eval harness (`eval/`) and an offline retrain step.
Out of scope (follow-ups):
- Local fan-out fallback (left as a follow-up). `--route-only` is exercisable. The executor interface is the seam a provider-agnostic local fan-out path would slot behind later; happy to add it if maintainers want it.
Eval results
Dataset: 16 held-out queries (6 easy + 10 hard; GSM8K / MATH / GPQA / MBPP).
Quality / blended cost / escalation
pare over the full 16-query dataset; gate precision is computed over the same fixed 10-query hard slice for every arm so the arms are comparable (`always_route` makes no escalation decision → N/A). Slice definitions are documented in `eval/RESULTS.md`. Blended cost is an estimated per-query dollar amount.
FusionFactory & continual learning
Each fusion call yields a panel of per-model responses plus a judge synthesis, exactly the `(query, model, response, performance)` observations FusionFactory needs. `fusion_log.to_training_rows` decomposes them into rows shaped for `llmrouter/data/api_calling_evaluation.py`, and the retrain step replays the logged sink to refit the gate thresholds offline. This directly serves the
repo's continual-learning TODO: the router's own fusion traffic becomes the
training signal that sharpens the route-vs-fuse gate over time, with no separate
labeling pass required.
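To make the row decomposition concrete, here is a minimal sketch of flattening one logged fusion call into per-model rows. It is not the PR's `fusion_log.to_training_rows`: the function name `to_rows_sketch` and the record layout (`query`, `panel`, `scores`) are hypothetical, and using a per-model score as the `performance` field is an assumption.

```python
from typing import Any, Dict, List


def to_rows_sketch(record: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten one fusion record into (query, model, response, performance) rows.

    Assumed (hypothetical) record layout:
      {"query": str,
       "panel": [{"model": str, "response": str}, ...],
       "scores": {model_name: float}}   # optional per-model scores
    """
    query = record["query"]
    scores = record.get("scores", {})
    rows = []
    for item in record.get("panel", []):
        model = item["model"]
        rows.append({
            "query": query,
            "model": model,
            "response": item["response"],
            # None marks a panel member that has no score yet.
            "performance": scores.get(model),
        })
    return rows
```

One row per panel member keeps each observation independent, so a later retrain step can filter out unscored rows without touching the rest of the record.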
Beta server-tool caveat
`openrouter:fusion` is an OpenRouter BETA server tool; its request/response shape may change. All OpenRouter HTTP specifics are confined to `executor.py` (request body, tool type, key resolution, transport, payload parsing), so an
upstream beta change touches one file. The executor degrades gracefully on judge
failure (synthesizes from panel responses). No API keys, auth headers, or raw
provider payloads are ever logged.
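The no-secrets logging guarantee can be illustrated with a small recursive scrubber. This is a sketch under stated assumptions rather than the code in `fusion_log.py`: the function name `scrub`, the set of sensitive key names, and the bearer-token pattern are all hypothetical choices.

```python
import re
from typing import Any

# Hypothetical deny-list of keys whose values must never reach the log.
SENSITIVE_KEYS = {"authorization", "api_key", "x-api-key", "headers", "raw_payload"}

# Masks bearer tokens that leak into free-text fields (assumed token format).
BEARER = re.compile(r"Bearer\s+\S+")


def scrub(value: Any) -> Any:
    """Return a copy of `value` with sensitive keys dropped and tokens masked."""
    if isinstance(value, dict):
        return {
            key: scrub(inner)
            for key, inner in value.items()
            if str(key).lower() not in SENSITIVE_KEYS
        }
    if isinstance(value, list):
        return [scrub(inner) for inner in value]
    if isinstance(value, str):
        return BEARER.sub("Bearer [REDACTED]", value)
    return value
```

Dropping whole keys rather than masking their values is the more conservative choice here, since a masked header can still reveal which credentials were in play.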
Testing
Torch-free, fully offline (HTTP mocked):