Pull requests: Dao-AILab/flash-attention

Disable 2CTA fwd non-causal on CUDA 12.9 to work around codegen regression
#2461 opened Apr 15, 2026 by Johnsonms (Collaborator)
Add CLC scheduler heuristic
#2455 opened Apr 13, 2026 by drisspg (Collaborator)
Support batch invariant forward for FA3
#2450 opened Apr 10, 2026 by Edenzzzz
[Cute,Fwd,Sm90] Ceil div in paged kv manager to prevent size 0
#2446 opened Apr 8, 2026 by imbr92 (Contributor)
Add dropout support to CuTe DSL attention kernels
#2439 opened Apr 6, 2026 by blake-snc (Contributor)
[CuTe,Fwd,SM90] Enable head dim 512 for SM90
#2422 opened Apr 1, 2026 by IwakuraRein
Add compress_factor for compressed causal attention
#2418 opened Mar 31, 2026 by jduprat (Contributor)
[Cute,Fwd,Sm90] Support SplitKV
#2415 opened Mar 31, 2026 by imbr92 (Contributor)
chore(tests): move benchmarks to benchmarks/cute/ and reduce test prints
#2408 opened Mar 29, 2026 by NJX-njx (Contributor)
fix(flash_fwd_sm90): zero partial V smem to prevent 0*NaN=NaN in PV GEMM
#2407 opened Mar 29, 2026 by NJX-njx (Contributor)
feat: setup_context for FlashAttnFunc (torch.func.grad)
#2405 opened Mar 28, 2026 by NJX-njx (Contributor)
fix(cute): SM120 forward/bwd and atomic add compatibility
#2404 opened Mar 28, 2026 by NJX-njx (Contributor)