🔬 The Finding
Shanghai AI Lab released Agents-A1, a 35B Mixture-of-Experts model that matches or beats trillion-parameter models (Kimi-K2.6, DeepSeek-V4-Pro) on long-horizon agent benchmarks. The key insight: instead of scaling parameters, they scaled agent horizons — training on 45K-token trajectories built from knowledge-action-observation chains, then distilling six specialized domain teachers into one deployable model.
⚙️ What It Means for Agentic Workflows
1. Smaller models + richer trajectories can beat bigger models. If you're choosing a backbone for an automated workflow, a well-trained 35B model can match or outperform a 1T-parameter model on multi-step agent tasks, so inference cost drops substantially without sacrificing quality on those tasks.
2. Trajectory quality is the new hyperparameter. Workflow designers should invest in building high-quality, long-horizon training trajectories (tool calls, observations, verifier feedback) rather than always reaching for a larger model. For long-horizon agent tasks, the data pipeline can matter more than model size.
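To make the trajectory idea concrete, here is a minimal Python sketch of how a workflow team might represent a knowledge-action-observation chain and filter trajectories before training. It is an illustration, not the paper's pipeline: the `Step` and `Trajectory` classes, the `keep_for_training` helper, the whitespace token count, and the `min_steps` and `min_score` thresholds are all assumptions made for this example. The 45K figure from the post is used here as an upper token budget, which is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One link in a knowledge-action-observation chain (hypothetical schema)."""
    knowledge: str    # context the agent relied on for this step
    action: str       # e.g. a serialized tool call
    observation: str  # what the tool or environment returned


@dataclass
class Trajectory:
    task: str
    steps: List[Step] = field(default_factory=list)
    verifier_score: float = 0.0  # assumed 0..1 score from an external verifier


def approx_tokens(text: str) -> int:
    # Crude whitespace proxy; swap in the real tokenizer of your model.
    return len(text.split())


def trajectory_tokens(traj: Trajectory) -> int:
    """Approximate token length of the task plus every step field."""
    total = approx_tokens(traj.task)
    for s in traj.steps:
        total += approx_tokens(s.knowledge)
        total += approx_tokens(s.action)
        total += approx_tokens(s.observation)
    return total


def keep_for_training(
    traj: Trajectory,
    min_steps: int = 3,        # assumed: drop trivially short chains
    max_tokens: int = 45_000,  # assumed budget, borrowed from the 45K figure
    min_score: float = 0.8,    # assumed verifier threshold
) -> bool:
    """Keep only multi-step, verifier-approved trajectories within budget."""
    return (
        len(traj.steps) >= min_steps
        and trajectory_tokens(traj) <= max_tokens
        and traj.verifier_score >= min_score
    )
```

In practice the filter is where most of the design effort would go: the thresholds decide how much of the raw agent log survives, so they are worth tuning against downstream task success rather than fixing up front.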
🔗 Source
Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent — June 29, 2026