Make sync models use better source for baseten provider mappings #845

Merged
Erin McNulty (erin2722) merged 4 commits into main from fix/baseten-provider-mappings on Jun 22, 2026
@erin2722

Baseten models on https://models.litellm.ai/ have been lagging significantly, so we kept missing the newest models. Baseten provides a list-models endpoint, though, so we can use it to deterministically pick up the newest models.

Erin McNulty (erin2722) and others added 4 commits June 22, 2026 17:49
Baseten's Model APIs /v1/models serves deepseek-ai/DeepSeek-V4-Pro and
moonshotai/Kimi-K2.7-Code under the exact same ids the catalog already uses for
Together, but available_providers listed only together. Verified both invoke on
Baseten directly (HTTP 200 via inference.baseten.co/v1/chat/completions with the
CI org key). Union baseten into available_providers + AvailableEndpointTypes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…eten models

Baseten serves NVIDIA Nemotron 3 Ultra only under the mixed-case id
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B (the catalog's lowercase
nvidia/nemotron-3-ultra-550b-a55b, used for Together, 404s on Baseten), so add a
separate baseten-cased entry with Baseten's verified metadata ($0.6/$2.4,
cache-read $0.12, 202800 ctx, reasoning). Validated HTTP 200 through the gateway.

Remove four baseten-only entries that Baseten now returns HTTP 410 'deprecated'
for (verified live), leaving them non-invocable on their only provider:
deepseek-ai/DeepSeek-V3-0324, moonshotai/Kimi-K2-Thinking,
moonshotai/Kimi-K2-Instruct-0905, zai-org/GLM-4.6.
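
As an illustration of the removal criterion described above (deprecated on Baseten, and Baseten is the only provider), here is a minimal sketch. The helper names `classifyProbeStatus` and `removalCandidates`, the entry shape, and the idea of collecting probe results in a status map are assumptions for this example; the commit only says the statuses were verified live, not how.

```typescript
// Hypothetical helpers, not part of the repository.
// Status meanings follow the commit messages: 200 = invocable,
// 404 = id not recognised by the provider, 410 = deprecated.
type ProbeResult = "invocable" | "unknown_id" | "deprecated" | "other";

function classifyProbeStatus(status: number): ProbeResult {
  if (status >= 200 && status < 300) return "invocable";
  if (status === 404) return "unknown_id";
  if (status === 410) return "deprecated";
  return "other";
}

// Assumed minimal entry shape; real catalog entries carry more fields.
interface ProviderEntry {
  id: string;
  available_providers: string[];
}

// An entry is a removal candidate only if the given provider is its sole
// provider AND the provider reported it as deprecated.
function removalCandidates(
  entries: ProviderEntry[],
  statusById: Record<string, number>,
  provider = "baseten",
): string[] {
  return entries
    .filter(
      (e) =>
        e.available_providers.length === 1 &&
        e.available_providers[0] === provider,
    )
    .filter((e) => {
      const status: number | undefined = statusById[e.id];
      return (
        status !== undefined && classifyProbeStatus(status) === "deprecated"
      );
    })
    .map((e) => e.id);
}
```

A model that is deprecated on Baseten but still listed under another provider is deliberately not returned, since it remains invocable elsewhere.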

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
sync_models.ts sourced models only from LiteLLM, which lags Baseten's own
catalog (hence the missed GLM-5.2 / Kimi-K2.x models). Add a sync-baseten
command that fetches Baseten's authoritative OpenAI-compatible /v1/models list and:
- adds models Baseten serves that are missing locally (with Baseten's pricing,
  context length, and reasoning/multimodal flags), and
- unions the baseten provider into available_providers and the index.ts
  AvailableEndpointTypes entry of models already present under the same id.

It is additive only and never prunes models absent from /v1/models, because that
list is not exhaustive (some served ids are unlisted); removals stay manual.
Requires BASETEN_API_KEY. The new exported helper addProviderToProviderMappingContent
widens existing index.ts entries (the existing helpers only add missing ones).
Adds 4 vitest tests (23 pass); no new tsc errors (5 pre-existing, unchanged).
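
The additive behaviour described above can be sketched roughly as follows. This is not the code from sync_models.ts: the names `parseModelIds` and `syncAdditive`, the `Catalog` shape, and the result fields are assumptions made for illustration, and the sketch ignores pricing, context length, and the index.ts mapping. It assumes only that the response follows the usual OpenAI-compatible list shape, `{ data: [{ id }] }`.

```typescript
// Assumed minimal shapes; real catalog entries carry pricing, context, flags.
interface CatalogEntry {
  available_providers: string[];
  [key: string]: unknown;
}
type Catalog = Record<string, CatalogEntry>;

interface SyncResult {
  catalog: Catalog;
  added: string[]; // ids newly created from the provider list
  widened: string[]; // existing ids that gained the provider
}

// Extract model ids from an OpenAI-compatible list response body.
function parseModelIds(body: unknown): string[] {
  if (typeof body !== "object" || body === null) return [];
  const data = (body as { data?: unknown }).data;
  if (!Array.isArray(data)) return [];
  return data
    .map((m) => (m !== null && typeof m === "object" ? (m as { id?: unknown }).id : undefined))
    .filter((id): id is string => typeof id === "string");
}

// Additive merge: add missing ids, union the provider into existing ones,
// and never delete local entries that the served list does not mention.
function syncAdditive(
  local: Catalog,
  servedIds: string[],
  provider = "baseten",
): SyncResult {
  const catalog: Catalog = { ...local };
  const added: string[] = [];
  const widened: string[] = [];
  for (const id of servedIds) {
    const existing: CatalogEntry | undefined = Object.prototype.hasOwnProperty.call(catalog, id)
      ? catalog[id]
      : undefined;
    if (existing === undefined) {
      catalog[id] = { available_providers: [provider] };
      added.push(id);
    } else if (!existing.available_providers.includes(provider)) {
      catalog[id] = {
        ...existing,
        available_providers: [...existing.available_providers, provider],
      };
      widened.push(id);
    }
  }
  return { catalog, added, widened };
}
```

Because the loop iterates only over served ids, a local entry that is absent from the list is left untouched, which matches the "removals stay manual" rule.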

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds a 'Sync Baseten models from /v1/models' step after the LiteLLM provider
sync, so new Baseten models and baseten provider unions land in the daily sync
PR (then flow through the existing changed-model collection, metadata
enrichment, canonicalize, and Codex review steps). Reads BASETEN_API_KEY from
repo secrets and skips cleanly when it is not configured, so the workflow stays
green until the key is added.
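
The "skip cleanly" behaviour could be implemented in several places (a workflow condition, a shell guard, or the command itself); the commit does not say which. The sketch below assumes it lives in the command's entry point. `decideBasetenSync` and `SyncDecision` are hypothetical names. Treating an empty string as "not configured" reflects that an unset secret typically reaches the job as an empty value.

```typescript
// Hypothetical guard, not taken from the repository.
type SyncDecision =
  | { run: true; apiKey: string }
  | { run: false; reason: string };

function decideBasetenSync(
  env: Record<string, string | undefined>,
): SyncDecision {
  // Missing, empty, or whitespace-only values all count as "not configured".
  const apiKey = env["BASETEN_API_KEY"]?.trim();
  if (!apiKey) {
    return {
      run: false,
      reason: "BASETEN_API_KEY is not configured; skipping Baseten sync",
    };
  }
  return { run: true, apiKey };
}
```

A caller would log the reason and exit with status 0 when `run` is false, so the step succeeds without touching the catalog.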

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: ai-proxy | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Jun 22, 2026 6:43pm

@erin2722 Erin McNulty (erin2722) merged commit 791b91f into main Jun 22, 2026
7 of 8 checks passed
Erin McNulty (erin2722) added a commit that referenced this pull request Jun 23, 2026
Merged current main (which has #849 Baseten pricing + #845 baseten provider
source + the deprecated-model exclusions). The 3-way merge dropped all of this
batch's regressions (they reverted values main now owns) and kept only the
net-new metadata: chat-latest pricing, r1-1776 deprecation, the OpenAI dated-
snapshot deprecation dates, the gpt-5.x-pro input_cache_read cleanup, and the
groq gpt-oss-120b max_output bump. This commit fixes the ["openai","azure"]
comma-spacing prettier violation the bot reintroduced in index.ts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2 participants