fix causal_softmax op to adapt with ascend by ShaneWoof · Pull Request #1280 · InfiniTensor/InfiniCore

ShaneWoof · 2026-06-13T07:46:09Z

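Adapt the causal_softmax operator on Ascend to handle an x tensor with a non-default batch stride and the BF16 dtype.

Changes: modified Infinicore/src/infiniop/ops/causal_softmax/ascend/causal_softmax_ascend.cc
1. Added BF16 support for the value tensors.
2. Introduced a contiguous temp buffer (batch × seq × total_seq_len). Each calculate call copies x into temp row by row, performs masked_fill and softmax on temp, and finally writes the result to y. This works around the limitation that aclnnInplaceMaskedFillTensor cannot handle an x_tensor with a non-default batch stride.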
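
The temp-buffer approach in item 2 can be illustrated with a small CPU reference. This is only a sketch and not the code in causal_softmax_ascend.cc: the function name `causal_softmax_via_temp`, the element-based strides, the contiguous last dimension, and the mask convention (row i may see columns 0 through i + total_seq_len - seq_len) are all assumptions made for illustration, and no aclnn call is used.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical CPU reference, NOT the Ascend kernel.
// x has logical shape [batch, seq_len, total_seq_len]; strides are in elements.
// Assumption: the last dimension is contiguous, only batch/seq strides may be padded.
std::vector<float> causal_softmax_via_temp(const float *x,
                                           std::size_t batch,
                                           std::size_t seq_len,
                                           std::size_t total_seq_len,
                                           std::ptrdiff_t stride_b,
                                           std::ptrdiff_t stride_s) {
    assert(seq_len <= total_seq_len);
    const float neg_inf = -std::numeric_limits<float>::infinity();
    const std::size_t offset = total_seq_len - seq_len;

    // Contiguous temp buffer of batch * seq_len * total_seq_len elements.
    std::vector<float> temp(batch * seq_len * total_seq_len);

    for (std::size_t b = 0; b < batch; ++b) {
        for (std::size_t i = 0; i < seq_len; ++i) {
            const float *src = x + static_cast<std::ptrdiff_t>(b) * stride_b
                                 + static_cast<std::ptrdiff_t>(i) * stride_s;
            float *row = temp.data() + (b * seq_len + i) * total_seq_len;

            // Step 1: copy one strided row of x into the contiguous buffer.
            std::copy(src, src + total_seq_len, row);

            // Step 2: masked fill. Assumed convention: columns past
            // i + offset are in the "future" and are set to -inf.
            const std::size_t visible = i + offset + 1;
            std::fill(row + visible, row + total_seq_len, neg_inf);

            // Step 3: numerically stable softmax over the row.
            const float row_max = *std::max_element(row, row + visible);
            float sum = 0.0f;
            for (std::size_t j = 0; j < total_seq_len; ++j) {
                row[j] = std::exp(row[j] - row_max); // exp(-inf) == 0 for masked columns
                sum += row[j];
            }
            for (std::size_t j = 0; j < total_seq_len; ++j) {
                row[j] /= sum;
            }
        }
    }
    return temp; // contiguous result, ready to be written to y
}
```

Because masking and softmax only ever touch the contiguous buffer, the layout of x matters solely during the row copy, which is why a padded batch stride no longer affects the masked-fill step.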

Status: all infiniop operator tests pass; the infinicore operator interface tests run, with 53/63 Passed.
(1) infiniop
Test cases:

Test results:

(2) infinicore
Test cases:

Test results:


ShaneWoof requested a review from a team June 13, 2026 07:46

format causal_softmax_ascend.cc

a6b6804

wooway777 approved these changes Jun 13, 2026

View reviewed changes

wooway777 merged commit 4a177ab into InfiniTensor:main Jun 13, 2026
5 checks passed

wooway777 merged 2 commits into InfiniTensor:main from ShaneWoof:fix/causal_softmax_ascend
