Commit f6a16e1
Allow compact block sparse index tensors (#2417)
* Allow compact block sparse index tensors
Relax validation in block_sparsity.py to allow idx.shape[3] <= expected_n_blocks
instead of requiring exact equality.
FA4 only accesses indices 0..cnt-1 per query tile, so the index tensor's last
dimension does not need to be as large as ceil(seqlen_k / block_size_n). This
enables memory-efficient compact index tensors that avoid O(N^2) memory at long
sequence lengths (e.g., 1M+ tokens for sparse attention / NSA workloads).
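As a rough illustration of the memory argument above, the sketch below compares a full-width index tensor with a compact one for a single (batch, head) slice. The tile size of 128, the 4-byte index element, and the compact width of 16 are assumptions chosen for the arithmetic, not values taken from this commit.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def index_bytes(seqlen_q, block_size_m, width, bytes_per_index=4):
    """Bytes for one (batch, head) slice holding `width` indices per query tile."""
    return ceil_div(seqlen_q, block_size_m) * width * bytes_per_index


seqlen = 1 << 20                      # 1,048,576 tokens
block = 128                           # assumed tile size for both m and n
full_width = ceil_div(seqlen, block)  # 8192 n-blocks at full width
compact_width = 16                    # assumed max(cnt) for a sparse pattern

full = index_bytes(seqlen, block, full_width)
compact = index_bytes(seqlen, block, compact_width)

print(full // 2**20, "MiB per (batch, head) at full width")       # 256 MiB
print(compact // 2**10, "KiB per (batch, head) at compact width")  # 512 KiB
```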
Changes:
- _check_and_expand_block: accept compact n-block dimension and expand only the
batch/head/m-block dimensions
- infer_block_sparse_expected_shapes: change strict equality check to upper-bound
check (error only when n-blocks exceeds expected, not when smaller)
Backward compatible: existing code that passes full-sized tensors is unaffected.
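The relaxed check described above might look roughly like the following. This is a minimal sketch, not the code in block_sparsity.py: the helper name `check_n_block_dim` and its signature are hypothetical, and only the comparison `idx.shape[3] <= expected_n_blocks` comes from the commit message.

```python
def check_n_block_dim(idx_shape, seqlen_k, block_size_n):
    """Hypothetical helper: upper-bound check on the n-block (last) dimension.

    idx_shape is assumed to be a 4-tuple (batch, heads, m_blocks, n_blocks).
    Returns the expected full width, ceil(seqlen_k / block_size_n).
    """
    expected_n_blocks = -(-seqlen_k // block_size_n)  # integer ceiling division
    n_blocks = idx_shape[3]
    # Previously an exact-equality check; now only an oversized tensor is an error.
    if n_blocks > expected_n_blocks:
        raise ValueError(
            f"index tensor has {n_blocks} n-blocks, expected at most {expected_n_blocks}"
        )
    return expected_n_blocks


# A full-sized tensor (8 n-blocks) and a compact one (3 n-blocks) both pass.
check_n_block_dim((1, 2, 8, 8), seqlen_k=1024, block_size_n=128)
check_n_block_dim((1, 2, 8, 3), seqlen_k=1024, block_size_n=128)
```

Passing a shape such as `(1, 2, 8, 9)` with the same arguments would raise `ValueError`, since 9 exceeds the expected 8 n-blocks.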
* Add test for compact block sparse index tensors
Verify that truncating block sparse index tensors to idx.shape[3] = max(cnt)
(instead of the full ceil(seqlen_k / block_size_n)) produces bit-identical
output to full-sized tensors. This validates the relaxed validation from
the previous commit.
1 parent 29e40cf commit f6a16e1
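The truncation that the test in the commit message exercises can be sketched in plain Python for a single (batch, head) slice. The names `compact_indices` and `visited_blocks` and the sample data are hypothetical; the real test operates on tensors and compares attention outputs, which this sketch does not attempt.

```python
def compact_indices(cnt, idx):
    """Truncate every row of idx to max(cnt) entries.

    cnt[m] is the number of valid key-block indices for query tile m;
    idx[m] is that tile's row of key-block indices.
    """
    width = max(cnt)
    return [row[:width] for row in idx]


def visited_blocks(cnt, idx):
    """Entries a kernel that reads only indices 0..cnt-1 would see for each tile."""
    return [row[:c] for c, row in zip(cnt, idx)]


# Four query tiles, eight key blocks at full width; the padding values are arbitrary.
cnt = [1, 3, 2, 0]
full_idx = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 4, 7, 0, 0, 0, 0, 0],
    [2, 5, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

compact_idx = compact_indices(cnt, full_idx)

# Each row shrinks from 8 entries to max(cnt) = 3, yet the visited entries match.
assert all(len(row) == 3 for row in compact_idx)
assert visited_blocks(cnt, compact_idx) == visited_blocks(cnt, full_idx)
```

Because only the first `cnt[m]` entries of each row are ever read, dropping the trailing padding leaves the visited blocks unchanged, which is why identical output is expected.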
File tree
2 files changed: +96 -2 lines changed
- flash_attn/cute
- tests/cute
flash_attn/cute (block_sparsity.py, per the commit message): 10 additions, 2 deletions
- Original lines 98-103, new lines 98-109: 6 lines added (new lines 101-106)
- Original lines 200-208, new lines 206-216: original line 203 replaced by new lines 209-211, and original line 205 replaced by new line 213
(Diff line contents not captured.)
tests/cute: 86 additions, 0 deletions
- Original lines 1712-1716, new lines 1712-1802: 86 lines added (new lines 1715-1800)
(Diff line contents not captured.)