Pull requests: EleutherAI/lm-evaluation-harness
Fix RACE doc_to_text keeping blank marker and dropping the question body
#3716 opened Apr 19, 2026 by Chessing234
[fix] Disambiguate cache entries for repeated generate_until requests (#3046)
#3715 opened Apr 17, 2026 by FazeelUsmani
2 tasks done
Fix DummyLM.generate_until write_out printing context as gen_kwargs
#3714 opened Apr 17, 2026 by Chessing234
Fix infinite loop in ruler qa generate_samples when used_docs cannot shrink
#3713 opened Apr 17, 2026 by Chessing234
Fix DummyLM.generate_until printing context as gen_kwargs
#3711 opened Apr 16, 2026 by Chessing234
Add diagnostic columns for answer-not-found and invalid-filter tracking
#3709 opened Apr 15, 2026 by fxmarty-amd (Contributor)
2 tasks
Fix MultiChoiceRegexFilter.find_match IndexError on all-empty capture groups
#3708 opened Apr 15, 2026 by Chessing234
Add OpenSubtitles2024 multi40 task configs and documentation
#3706 opened Apr 15, 2026 by hengyu-luo
Add LICA-Bench: graphic design VLM evaluation (39 tasks, 7 domains)
#3705 opened Apr 15, 2026 by purvanshi
3 tasks
fix: predict all CoQA turn answers instead of only the last turn
#3704 opened Apr 14, 2026 by rahulraj-jhawar-devrev (Draft)
Fix BigBench multiple-choice crash on mixed-format tasks
#3702 opened Apr 13, 2026 by Chessing234
refactor(vllm): remove deprecated vLLM V0 ray code path
#3701 opened Apr 13, 2026 by Anai-Guo
fix: don't pass task stop sequences to vLLM for reasoning models
#3700 opened Apr 12, 2026 by jwmacd
Fix median aggregation returning arbitrary element instead of median
#3696 opened Apr 12, 2026 by Chessing234
1 of 2 tasks
Fix acc_all_stderr grouping by question_id only (drops paragraph_id)
#3695 opened Apr 11, 2026 by Chessing234
Fix mmlu_pro fewshot answers leaking into user role under chat template
#3693 opened Apr 9, 2026 by kiwaku
[Feat] Add native Tensor Parallelism support for HF backend
#3692 opened Apr 9, 2026 by YangKai0616
Fix GPQA preprocessing: remove bracket-stripping regex that corrupts answer text
#3691 opened Apr 8, 2026 by Robby955
fix(vllm): guard against None prefix_token_id in tok_encode
#3687 opened Apr 8, 2026 by Darcy-Lee