aidanq06/WebBench

webbench

benchmarks small language models (0.6B–8B) on MMLU-style multiple-choice questions, entirely in the browser. the model is downloaded to your device and runs on your GPU through WebLLM and WebGPU, with no server-side inference and no API keys.

you get an accuracy score with a 95% Wilson confidence interval, the full raw output for every question, and a shareable report. runs are anonymously added to a public results page.
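
for reference, a minimal sketch of how a 95% Wilson score interval can be computed from a count of correct answers. the function name `wilsonInterval` and the `Interval` type are hypothetical and chosen for illustration; this is the standard formula, not necessarily the exact code webbench uses.

```typescript
// Wilson score interval for a binomial proportion.
// Hypothetical helper for illustration; not taken from the webbench source.
interface Interval {
  lower: number;
  upper: number;
}

function wilsonInterval(correct: number, total: number, z = 1.959964): Interval {
  if (
    !Number.isInteger(correct) ||
    !Number.isInteger(total) ||
    total <= 0 ||
    correct < 0 ||
    correct > total
  ) {
    throw new RangeError("expected integers with 0 <= correct <= total and total > 0");
  }
  const p = correct / total; // observed accuracy
  const z2 = z * z;
  const denom = 1 + z2 / total;
  // The interval is centered on a value pulled toward 0.5, not on p itself.
  const center = (p + z2 / (2 * total)) / denom;
  const half =
    (z * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total))) / denom;
  // Clamp to [0, 1] to absorb floating-point rounding at the extremes.
  return {
    lower: Math.max(0, center - half),
    upper: Math.min(1, center + half),
  };
}
```

for example, `wilsonInterval(50, 100)` gives an interval of roughly 0.40 to 0.60. unlike the simpler normal approximation, the Wilson interval stays inside [0, 1] and remains usable when accuracy is near 0% or 100% or when the number of questions is small.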

stack

next.js, react, tailwind, framer motion, zustand, @mlc-ai/web-llm, supabase.

questions from the MMLU benchmark (Hendrycks et al., 2021). in-browser inference via WebLLM.
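
a rough sketch of how an MMLU-style question might be turned into a prompt and how a letter answer might be pulled out of the model's raw output. everything here is an assumption made for illustration: the `McQuestion` shape, the prompt wording, and the `formatQuestion` / `extractChoice` helpers are hypothetical and are not taken from webbench or from WebLLM.

```typescript
// Hypothetical types and helpers; the real prompt format may differ.
type Choice = "A" | "B" | "C" | "D";

interface McQuestion {
  question: string;
  choices: [string, string, string, string];
  answer: Choice;
}

const LETTERS: Choice[] = ["A", "B", "C", "D"];

// Build a plain-text prompt listing the four options and asking for a letter.
function formatQuestion(q: McQuestion): string {
  const options = q.choices.map((c, i) => `${LETTERS[i]}. ${c}`).join("\n");
  return `${q.question}\n${options}\nAnswer with a single letter (A, B, C, or D).\nAnswer:`;
}

// Extract a letter from raw model output, or null if none is found.
// Heuristic only: a reply that opens with the article "A" would be misread as choice A.
function extractChoice(output: string): Choice | null {
  const text = output.trim();
  // Case 1: the reply starts with the letter, e.g. "B", "B. Paris", "(C)".
  const leading = text.match(/^\(?([A-D])\)?(?![A-Za-z])/);
  if (leading) return leading[1] as Choice;
  // Case 2: the letter follows the word "answer", e.g. "The answer is B."
  const phrase = text.match(/[Aa]nswer\s*(?:is|:)?\s*\(?([A-D])\)?(?![A-Za-z])/);
  return phrase ? (phrase[1] as Choice) : null;
}

// A question counts as correct only when a letter is found and matches the key.
function isCorrect(q: McQuestion, output: string): boolean {
  return extractChoice(output) === q.answer;
}
```

keeping the raw output alongside the extracted letter makes it possible to audit cases where a heuristic like this misreads the model's reply.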

About

LLM benchmarking within your browser
