Make sync models use better source for baseten provider mappings #845
Merged
Baseten's Model APIs /v1/models serves deepseek-ai/DeepSeek-V4-Pro and moonshotai/Kimi-K2.7-Code under the exact same ids the catalog already uses for Together, but available_providers listed only together. Verified both invoke on Baseten directly (HTTP 200 via inference.baseten.co/v1/chat/completions with the CI org key). Union baseten into available_providers + AvailableEndpointTypes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…eten models

Baseten serves NVIDIA Nemotron 3 Ultra only under the mixed-case id nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B (the catalog's lowercase nvidia/nemotron-3-ultra-550b-a55b, used for Together, 404s on Baseten), so add a separate baseten-cased entry with Baseten's verified metadata ($0.6/$2.4, cache-read $0.12, 202800 ctx, reasoning). Validated HTTP 200 through the gateway.

Remove four baseten-only entries that Baseten now returns HTTP 410 'deprecated' for (verified live), leaving them non-invocable on their only provider: deepseek-ai/DeepSeek-V3-0324, moonshotai/Kimi-K2-Thinking, moonshotai/Kimi-K2-Instruct-0905, zai-org/GLM-4.6.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
sync_models.ts only sourced models from LiteLLM, which lags Baseten's own catalog (the GLM-5.2 / Kimi-K2.x misses). Add a sync-baseten command that fetches Baseten's authoritative OpenAI-compatible /v1/models list and:

- adds models Baseten serves that are missing locally (with Baseten's pricing, context length, and reasoning/multimodal flags), and
- unions the baseten provider into available_providers + the index.ts AvailableEndpointTypes entry of models already present under the same id.

It is additive only and never prunes models absent from /v1/models, because that list is not exhaustive (some served ids are unlisted); removals stay manual. Requires BASETEN_API_KEY.

New exported helper addProviderToProviderMappingContent widens existing index.ts entries (the existing helpers only add missing ones).

+4 vitest tests (23 pass); no new tsc errors (5 pre-existing unchanged).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
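The additive-only behavior described in this commit can be sketched roughly as follows. This is an illustration, not the code in sync_models.ts: the `LocalModel`, `RemoteModel`, and `Catalog` shapes, the `max_input_tokens` and `context_length` field names, and the `mergeBasetenModels` function are all hypothetical, and the real command also carries pricing and reasoning/multimodal flags. The sketch takes an already-fetched /v1/models list so that no request format has to be assumed.

```typescript
// Hypothetical shapes; the real catalog schema in sync_models.ts may differ.
interface LocalModel {
  available_providers: string[];
  max_input_tokens?: number;
}

interface RemoteModel {
  id: string;
  context_length?: number;
}

type Catalog = Record<string, LocalModel>;

// Additive merge: add models missing locally, union "baseten" into models
// already present under the exact same (case-sensitive) id, and never prune.
function mergeBasetenModels(local: Catalog, remote: RemoteModel[]): Catalog {
  const merged: Catalog = {};

  // Copy every local entry so nothing absent from the remote list is dropped.
  for (const [id, model] of Object.entries(local)) {
    merged[id] = {
      ...model,
      available_providers: [...model.available_providers],
    };
  }

  for (const r of remote) {
    const existing = merged[r.id];
    if (existing === undefined) {
      // Model served by Baseten but missing locally: add it.
      merged[r.id] = {
        available_providers: ["baseten"],
        max_input_tokens: r.context_length,
      };
    } else if (!existing.available_providers.includes("baseten")) {
      // Model already present under the same id: widen its providers.
      existing.available_providers.push("baseten");
    }
  }

  return merged;
}
```

Because ids are matched exactly, a mixed-case id such as nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B would be added as its own entry rather than merged into the lowercase one, which matches the separate baseten-cased entry described in the earlier commit.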
Adds a 'Sync Baseten models from /v1/models' step after the LiteLLM provider sync, so new Baseten models and baseten provider unions land in the daily sync PR (then flow through the existing changed-model collection, metadata enrichment, canonicalize, and Codex review steps). Reads BASETEN_API_KEY from repo secrets and skips cleanly when it is not configured, so the workflow stays green until the key is added.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
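One way to get the "skips cleanly" behavior is a small guard that runs before any network call. The sketch below is an assumption about how such a guard could look; the actual workflow may implement the skip in the workflow definition itself rather than in TypeScript, and the `basetenSyncDecision` name and its return shape are hypothetical. Only the BASETEN_API_KEY variable name comes from the commit message.

```typescript
// Environment passed in explicitly so the guard is easy to test.
type Env = Record<string, string | undefined>;

interface SyncDecision {
  run: boolean;
  reason: string;
}

// Hypothetical guard: decide whether the Baseten sync step should run.
// A missing or blank key is treated as "not configured", not as an error,
// so the caller can exit successfully and keep the workflow green.
function basetenSyncDecision(env: Env): SyncDecision {
  const key = env["BASETEN_API_KEY"];
  if (key === undefined || key.trim() === "") {
    return {
      run: false,
      reason: "BASETEN_API_KEY is not configured; skipping Baseten sync",
    };
  }
  return { run: true, reason: "BASETEN_API_KEY found; running Baseten sync" };
}
```

A caller would pass `process.env`, log `reason`, and return early with a success status when `run` is false.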
Ken Jiang (knjiang) approved these changes on Jun 22, 2026
Erin McNulty (erin2722) added a commit that referenced this pull request on Jun 23, 2026
Merged current main (which has #849 Baseten pricing + #845 baseten provider source + the deprecated-model exclusions). The 3-way merge dropped all of this batch's regressions (they reverted values main now owns) and kept only the net-new metadata: chat-latest pricing, r1-1776 deprecation, the OpenAI dated-snapshot deprecation dates, the gpt-5.x-pro input_cache_read cleanup, and the groq gpt-oss-120b max_output bump. This commit fixes the ["openai","azure"] comma-spacing prettier violation the bot reintroduced in index.ts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Baseten models have been lagging badly on https://models.litellm.ai/, so we kept missing the newest ones. Baseten provides a list-models endpoint, though, so we can use it to deterministically pick up the newest models.