Audit Together AI models and reasoning controls by rekram1-node · Pull Request #2135 · anomalyco/models.dev

rekram1-node · 2026-06-10T21:05:23Z

Summary

audit every Together AI serverless chat model against the official provider catalog, exact model pages, changelog, reasoning docs, and deprecation schedule
add 9 missing active models, covering all 22 currently listed serverless chat models
retain 7 historical files as deprecated and preserve their capability metadata independently of lifecycle status
encode reasoning controls only where Together documents exact provider behavior

Evidence

Current catalog: https://docs.together.ai/docs/serverless/models
Serverless reasoning table and controls: https://docs.together.ai/docs/inference/chat/reasoning
Dated pricing/releases: https://docs.together.ai/docs/changelog
Deprecations: https://docs.together.ai/docs/deprecations
Google Gemma 4 exact page: https://www.together.ai/models/gemma-4-31b
Pearl Gemma 4 exact page: https://www.together.ai/models/gemma-4-31b-it-pearl
MiniMax M2.5 exact page: https://www.together.ai/models/minimax-m2-5
Qwen3 235B Instruct exact page: https://www.together.ai/models/qwen3-235b-a22b-instruct-2507-fp8
DeepSeek V4 Pro controls: https://docs.together.ai/docs/deepseek-v4-quickstart

Reasoning semantics

MiniMax M2.7 and deprecated DeepSeek R1: fixed reasoning (reasoning_options = []) based on positive fixed-mode evidence.
Documented hybrid models use a toggle; GPT-OSS uses the exact effort values low, medium, high; DeepSeek V4 Pro uses a toggle plus the exact effort values high, max.
Nemotron retains its verified hybrid toggle. Its medium/high depth switch is intentionally not encoded as generic effort because Together exposes chat_template_kwargs={"medium_effort": true}, not reasoning_effort, and the generic schema cannot represent that transport-specific boolean accurately.
Google and Pearl Gemma 4 exact endpoint pages explicitly describe configurable thinking, so both retain reasoning = true. Their omission from Together’s serverless reasoning table conflicts with those exact pages, and no current first-party wire parameter or accepted values were found; reasoning_options is intentionally omitted.
Deprecated MiniMax M2.5 retains reasoning = true because its exact endpoint page classifies it as reasoning. No positive Together evidence establishes fixed/toggle/effort/budget semantics, so options remain unresolved rather than fabricated.
Qwen3 235B A22B Instruct 2507 remains reasoning = false: Together recommends it for reasoning workloads and markets reasoning aptitude, but does not document reasoning-token output or a fixed/toggle/effort/budget mode.
All reasoning = false models omit reasoning_options.
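
The control categories above can be sketched as a small classifier. This is an illustration only, not the models.dev schema: the `ModelEntry` and `ReasoningOption` shapes, the `classify` helper, and the example ids are hypothetical. Only the field names `reasoning` and `reasoning_options`, and the empty-list convention for fixed reasoning, come from this description.

```typescript
// Hypothetical shapes for illustration; the real models.dev schema may differ.
type ReasoningOption =
  | { type: "toggle" }
  | { type: "effort"; values: string[] };

interface ModelEntry {
  id: string;
  reasoning: boolean;
  // Omitted when controls are unresolved or when reasoning is false.
  reasoning_options?: ReasoningOption[];
}

type ControlKind =
  | "none"
  | "fixed"
  | "toggle"
  | "effort"
  | "toggle+effort"
  | "unresolved";

function classify(model: ModelEntry): ControlKind {
  if (!model.reasoning) return "none";
  const options = model.reasoning_options;
  if (options === undefined) return "unresolved"; // no documented control
  if (options.length === 0) return "fixed"; // always reasons, nothing to set
  const hasToggle = options.some((o) => o.type === "toggle");
  const hasEffort = options.some((o) => o.type === "effort");
  if (hasToggle && hasEffort) return "toggle+effort";
  return hasToggle ? "toggle" : "effort";
}

// Placeholder ids, not real catalog ids.
const examples: ModelEntry[] = [
  { id: "example/fixed", reasoning: true, reasoning_options: [] },
  { id: "example/hybrid", reasoning: true, reasoning_options: [{ type: "toggle" }] },
  {
    id: "example/effort",
    reasoning: true,
    reasoning_options: [{ type: "effort", values: ["low", "medium", "high"] }],
  },
  {
    id: "example/both",
    reasoning: true,
    reasoning_options: [{ type: "toggle" }, { type: "effort", values: ["high", "max"] }],
  },
  { id: "example/unresolved", reasoning: true },
  { id: "example/plain", reasoning: false },
];

const kinds: ControlKind[] = examples.map(classify);
console.log(kinds);
```

Keeping "omitted" distinct from "empty list" is what lets an unresolved model stay separate from a fixed-reasoning one.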

Metadata corrections

DeepSeek V4 Pro: $1.74 input / $3.48 output / $0.20 cached input, effective June 9.
Google Gemma 4: $0.39 input / $0.97 output, effective May 21; structured outputs enabled. The current catalog table still shows the pre-May-21 price, while the dated changelog and exact page show the effective values used here.
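
As a rough sanity check on the corrected prices, the sketch below estimates a request's cost. It assumes the figures are USD per 1,000,000 tokens, which this description does not state; the `Pricing` and `Usage` shapes and the `estimateCostUsd` helper are hypothetical.

```typescript
// Assumption: prices are USD per 1,000,000 tokens.
interface Pricing {
  input: number;
  output: number;
  cachedInput?: number;
}

interface Usage {
  uncachedInputTokens: number;
  cachedInputTokens: number;
  outputTokens: number;
}

// Values copied from the corrections listed above.
const deepseekV4Pro: Pricing = { input: 1.74, output: 3.48, cachedInput: 0.2 };
const gemma4: Pricing = { input: 0.39, output: 0.97 };

function estimateCostUsd(pricing: Pricing, usage: Usage): number {
  const perToken = (usdPerMillion: number): number => usdPerMillion / 1_000_000;
  // Fall back to the regular input price when no cached price is listed.
  const cachedRate = pricing.cachedInput ?? pricing.input;
  return (
    usage.uncachedInputTokens * perToken(pricing.input) +
    usage.cachedInputTokens * perToken(cachedRate) +
    usage.outputTokens * perToken(pricing.output)
  );
}

const sampleUsage: Usage = {
  uncachedInputTokens: 800_000,
  cachedInputTokens: 200_000,
  outputTokens: 100_000,
};

console.log(estimateCostUsd(deepseekV4Pro, sampleUsage).toFixed(4));
console.log(estimateCostUsd(gemma4, sampleUsage).toFixed(4));
```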

Matrix

29 files: 22 active, 7 deprecated.
Active: 14 reasoning, 8 non-reasoning. Controls: 1 fixed, 8 toggle, 2 effort, 1 toggle+effort, 2 unresolved.
Deprecated: 4 reasoning, 3 non-reasoning. Controls: 1 fixed, 2 toggle, 1 unresolved.
Explicit unresolved allowlist: google/gemma-4-31B-it, pearl-ai/gemma-4-31b-it, MiniMaxAI/MiniMax-M2.5.
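
The matrix counts can be cross-checked with simple arithmetic. The sketch below only re-adds the numbers quoted above; the `Tally` shape and helper names are hypothetical and do not reflect the check actually run in the repository.

```typescript
interface Tally {
  reasoning: number;
  nonReasoning: number;
  controls: Record<string, number>;
}

// Counts copied from the matrix above.
const active: Tally = {
  reasoning: 14,
  nonReasoning: 8,
  controls: { fixed: 1, toggle: 8, effort: 2, "toggle+effort": 1, unresolved: 2 },
};
const deprecated: Tally = {
  reasoning: 4,
  nonReasoning: 3,
  controls: { fixed: 1, toggle: 2, unresolved: 1 },
};

const sum = (values: number[]): number => values.reduce((a, b) => a + b, 0);
const controlTotal = (t: Tally): number => sum(Object.values(t.controls));
const fileTotal = (t: Tally): number => t.reasoning + t.nonReasoning;
const unresolvedCount = (t: Tally): number => t.controls["unresolved"] ?? 0;

const checks: Record<string, boolean> = {
  activeFiles: fileTotal(active) === 22,
  deprecatedFiles: fileTotal(deprecated) === 7,
  allFiles: fileTotal(active) + fileTotal(deprecated) === 29,
  // Every reasoning model falls into exactly one control category.
  activeControls: controlTotal(active) === active.reasoning,
  deprecatedControls: controlTotal(deprecated) === deprecated.reasoning,
  // The allowlist names three models: two active, one deprecated.
  unresolvedMatchesAllowlist: unresolvedCount(active) + unresolvedCount(deprecated) === 3,
};

console.log(checks);
```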

Verification

bun validate
bun test packages/core/test/sync-runner.test.ts (9 passed)
full 29-file matrix check with explicit unresolved allowlist
generated Together output assertions for corrected reasoning, status, structured output, and prices
git diff --check
no Together API credential was available, so no live inference tests were run
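
One way the matrix check with an explicit unresolved allowlist could look is sketched below. It is not the check used in this PR: the `ModelFile` shape, `checkModel`, `findViolations`, and the `example/...` ids are hypothetical. Only the three allowlisted ids and the two rules (reasoning = false omits `reasoning_options`; unresolved reasoning models must be allowlisted) come from this description.

```typescript
interface ModelFile {
  id: string;
  reasoning: boolean;
  hasReasoningOptions: boolean; // true when the file defines reasoning_options
}

const UNRESOLVED_ALLOWLIST = new Set<string>([
  "google/gemma-4-31B-it",
  "pearl-ai/gemma-4-31b-it",
  "MiniMaxAI/MiniMax-M2.5",
]);

// Returns a violation message, or null when the file is consistent.
function checkModel(m: ModelFile): string | null {
  if (!m.reasoning && m.hasReasoningOptions) {
    return `${m.id}: reasoning = false must omit reasoning_options`;
  }
  if (m.reasoning && !m.hasReasoningOptions && !UNRESOLVED_ALLOWLIST.has(m.id)) {
    return `${m.id}: reasoning_options missing and model is not allowlisted`;
  }
  if (m.hasReasoningOptions && UNRESOLVED_ALLOWLIST.has(m.id)) {
    return `${m.id}: allowlisted as unresolved but defines reasoning_options`;
  }
  return null;
}

function findViolations(files: ModelFile[]): string[] {
  return files
    .map(checkModel)
    .filter((message): message is string => message !== null);
}

// Placeholder entries alongside one allowlisted id.
const sampleFiles: ModelFile[] = [
  { id: "google/gemma-4-31B-it", reasoning: true, hasReasoningOptions: false },
  { id: "example/hybrid", reasoning: true, hasReasoningOptions: true },
  { id: "example/plain", reasoning: false, hasReasoningOptions: false },
  { id: "example/forgotten", reasoning: true, hasReasoningOptions: false },
];

console.log(findViolations(sampleFiles));
```

The third rule keeps the allowlist from going stale: once a model's controls are documented and encoded, its entry has to be removed.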

Unresolved gaps

Gemma 4: exact pages establish configurable thinking, but current reasoning docs omit the endpoints and no exact Together wire control is documented.
MiniMax M2.5: historical reasoning capability is established, but exact control semantics are not.
Nemotron medium/high depth is documented but is not representable by current generic reasoning option types without mislabeling it as reasoning_effort.
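
The Nemotron gap is easier to see by comparing request bodies. The sketch below assumes an OpenAI-compatible chat request shape, and that GPT-OSS effort is sent as a top-level `reasoning_effort` field; neither is confirmed here. The `ChatRequest` type, both builder functions, and the model id placeholders are hypothetical; only `chat_template_kwargs={"medium_effort": true}` is taken from this description.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Assumed OpenAI-compatible request body; not a verified Together schema.
interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  reasoning_effort?: "low" | "medium" | "high";
  chat_template_kwargs?: Record<string, boolean>;
}

// Generic effort control: one enumerated value in a dedicated field.
function withGenericEffort(
  model: string,
  messages: ChatMessage[],
  effort: "low" | "medium" | "high",
): ChatRequest {
  return { model, messages, reasoning_effort: effort };
}

// Nemotron depth switch: a template-level boolean, not an effort value.
function withNemotronMediumDepth(model: string, messages: ChatMessage[]): ChatRequest {
  return { model, messages, chat_template_kwargs: { medium_effort: true } };
}

const messages: ChatMessage[] = [{ role: "user", content: "Summarize this diff." }];

// Model ids are placeholders, not real catalog ids.
const genericRequest = withGenericEffort("<gpt-oss-model-id>", messages, "medium");
const nemotronRequest = withNemotronMediumDepth("<nemotron-model-id>", messages);

console.log(JSON.stringify(genericRequest, null, 2));
console.log(JSON.stringify(nemotronRequest, null, 2));
```

Both requests ask for a medium depth, yet they share no field, which is why listing the Nemotron switch as a generic effort option would misdescribe the wire format.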

rekram1-node added 3 commits June 10, 2026 16:05

Audit Together AI model catalog and reasoning

09c7f86

Correct Together Qwen reasoning metadata

cb4fd81

Correct Together reasoning and pricing metadata

8807bb0

rekram1-node wants to merge 3 commits into dev from audit/togetherai-reasoning-20260610
