Pull requests: ROCm/FlyDSL
Pull requests list
[layernorm] Add backward pass for training (PR 3/3, #769)
#801
opened Jul 3, 2026 by
jhinpan
Contributor
[rmsnorm] Add fused-add / residual backward (PR 2/3, #769)
#800
opened Jul 3, 2026 by
jhinpan
Contributor
[Fix] Consistent promotion rules for Vector and Numeric
#797
opened Jul 3, 2026 by
sjfeng1999
Collaborator
1 task
[rmsnorm] Add backward pass + forward store_rstd for training (PR 1/3, #769)
#795
opened Jul 3, 2026 by
jhinpan
Contributor
[Kernel] conv3d: 8-wave double-buffered FP8 implicit-GEMM kernel(gfx950)
#794
opened Jul 1, 2026 by
jiacao-amd
2 tasks
[Perf] preshuffle_gemm: vectorize scale_a epilogue load (4x32b -> 1x128b)
#791
opened Jul 1, 2026 by
coderfeli
Collaborator
[Enh] Add finer FastMathFlag control
#789
opened Jul 1, 2026 by
sjfeng1999
Collaborator
1 task
[4/5] autotune: CI guard + committed-config regression check (#770)
#788
opened Jul 1, 2026 by
jhinpan
Contributor
[3/5] autotune: offline config emit + runtime lookup (#770)
#786
opened Jul 1, 2026 by
jhinpan
Contributor
[2/5] autotune: two-track config + first real adopter (rmsnorm) (#770)
#785
opened Jul 1, 2026 by
jhinpan
Contributor
[1/5] autotune: harden cache key + add restore_value (#770)
#783
opened Jul 1, 2026 by
jhinpan
Contributor
[Kernel] Add compile_mxfp6_gemm: MXFP6×MXFP4 preshuffle GEMM (gfx950)
#780
opened Jun 30, 2026 by
amd-satre
Contributor
1 task done
[Feat]: Add dispatch & combine tuning configuration
multi-gpu
#772
opened Jun 30, 2026 by
yanboshao
Contributor
1 task
flash_attn_generic: replace raw arith.* FP ops with FlyDSL-typed fast…
#764
opened Jun 29, 2026 by
xudoyuan
Collaborator
1 task
FlyDSL gemm_decode: small-M dense GEMM kernels (BF16/FP8/blockscale)
#757
opened Jun 26, 2026 by
vedenev-amd
Draft
1 task
feat(moe): layout-API MXFP4 (a4w4/a8w4) MoE gemm
#753
opened Jun 26, 2026 by
coderfeli
Collaborator
[Fix] Set identity block scales for CDNA4 MFMA_Scale in fp8 row-scale…
#744
opened Jun 25, 2026 by
amd-songpiao
[Kernel] conv2d: 8-wave double-buffered implicit-GEMM BF16 kernel(gfx950)
#733
opened Jun 24, 2026 by
jiacao-amd
1 task
[Kernel] feat: Add MXFP6-E2M3 activation support to mixed_moe_gemm_2stage
#709
opened Jun 19, 2026 by
amd-satre
Contributor
1 task done