Feature/rvv support by Sherlockzhangjinge · Pull Request #4359 · alibaba/MNN

Sherlockzhangjinge · 2026-04-08T11:31:15Z

Description

Develop the RVV accelerated version of the methods below:
MNNAvgPoolInt8
MNNFloat2Int8
MNNInt8ScaleToFloat
MNNLineDepthWiseInt8AddBiasScaleUnit
MNNMaxPoolInt8
MNNReluWithSlopeChannelInt8

Module

CPU

Type

Signed-off-by: typer-J <2236066784@qq.com> Signed-off-by: Sherlockzhangjinge <zjgzhangjinge@outlook.com> Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>

ADD missing semicolon

wangzhaode

Code Review

🔴 Blocking Issues

1. 代码风格检查不通过（CRLF 换行 / 缩进混用 / 未使用代码）

多处违反项目 clang-format 规范（.clang-format: UseTab: Never, IndentWidth: 4）：

MNNAvgPoolInt8.cpp, MNNInt8ScaleToFloat.cpp, MNNLineDepthWiseInt8AddBiasScaleUnit.cpp, MNNMaxPoolInt8.cpp, MNNReluWithSlopeChannelInt8.cpp — 全部使用 CRLF (\r\n) 换行，与项目其余文件的 LF 换行不一致，会导致 clang-format 和 diff 工具报错
Int8FunctionsOpt.cpp 第 2586 行 — 使用 tab 缩进而非 4 空格，与同文件其他行风格不一致
MNNReluWithSlopeChannelInt8.cpp 第 13-18 行 — 定义了 ALIMAX/ALIMIN 宏但全文未使用，属于残留代码
MNNFloat2Int8.cpp 第 44 行 — 行尾存在 trailing whitespace

修复建议：运行 clang-format -i -style=file 统一格式化，并将 5 个新文件的换行符转为 LF。移除未使用的宏定义。

2. vfcvt.x.f 截断取整 vs 参考实现 roundf() 四舍五入不一致

文件: MNNFloat2Int8.cpp:68, MNNLineDepthWiseInt8AddBiasScaleUnit.cpp:63, MNNReluWithSlopeChannelInt8.cpp:67

RVV 的 vfcvt.x.f.v 指令是向零截断取整，而参考实现（Int8FunctionsOpt.cpp:1710, 1669）使用 roundf() 做四舍五入。例如 2.7f 参考得 3，RVV 得 2；-2.7f 参考得 -3，RVV 得 -2。这会导致量化精度偏差，推理结果与 CPU 参考路径不一致。

修复建议：在 vfcvt 前加 +/-0.5f 偏移模拟四舍五入。正数路径：vfcvt(v + 0.5f)；负数路径：vfcvt(v - 0.5f)。可借助 vmfgt mask + vmerge 实现分支。

3. MNNFloat2Int8.cpp — vsetvlmax 返回值不确定，小 VLEN 设备可能越界

文件: MNNFloat2Int8.cpp, 第 34 行

vsetvlmax 取决于硬件 VLEN。VLEN=128 的最小配置下，e32m2 的 vlmax 仅为 2。后续用 vid % 4 构造 channel index 并通过 vloxei 从 scale[4] gather，当 vl_template < 4 时无法覆盖 4 个 channel，scale/zero 模板不完整。

修复建议：添加 vl_template < 4 的 fallback 标量路径，或改用 e32m1 类型并确保 vlmax >= 4，否则回退到参考实现。

Reviewed by qwen3.6-plus-preview

wangzhaode

Review Summary

This PR adds RVV (RISC-V Vector) accelerated implementations for 6 Int8 quantization functions: MNNAvgPoolInt8, MNNFloat2Int8, MNNInt8ScaleToFloat, MNNLineDepthWiseInt8AddBiasScaleUnit, MNNMaxPoolInt8, and MNNReluWithSlopeChannelInt8. The RVV intrinsics usage looks correct overall.

Blocking Issues

Two CI checks are currently failing and must be fixed before merge:

Commit Message Format: Two commits do not follow the required [Module:Type] Description format:
- 670284be — implement some methods in Int8FunctionsOpt.cpp with RVV
- 5f6bce67 — Fix duplicate assignment for MNNMaxPoolInt8
- Please squash/reword to match the format, e.g. [CPU:Feature] Add RVV accelerated Int8 functions
Changed Lines Format (clang-format): Multiple files have formatting issues detected by CI:
- source/backend/cpu/compute/Int8FunctionsOpt.cpp: extern declarations exceed line width and need wrapping; line 2586 uses a tab character instead of spaces for indentation
- source/backend/cpu/riscv/rvv/MNNReluWithSlopeChannelInt8.cpp: macro definitions and function signature formatting do not conform to clang-format
- Other new .cpp files likely have similar issues
- Fix: run clang-format -i -style=file on all changed files

Minor Suggestions

The Chinese comment // ---- 循环外预构造周期向量 ---- and // ---- 主循环：纯向量运算，无标量介入 ---- in MNNFloat2Int8.cpp — consider using English comments for consistency with the rest of the codebase.

@Sherlockzhangjinge Could you please fix the above formatting and commit message issues? Thanks!

Reviewed by qwen3.6-plus-preview

…thods by rvv; Fix missing simicolon and format problem Signed-off-by: typer-J <2236066784@qq.com> Signed-off-by: Sherlockzhangjinge <zjgzhangjinge@outlook.com> Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>

Signed-off-by: typer-J <2236066784@qq.com> Signed-off-by: Sherlockzhangjinge <zjgzhangjinge@outlook.com> Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>

wangzhaode

Summary

This PR adds RVV (RISC-V Vector Extension) accelerated implementations for 6 Int8 quantization operators: MNNAvgPoolInt8, MNNFloat2Int8, MNNInt8ScaleToFloat, MNNLineDepthWiseInt8AddBiasScaleUnit, MNNMaxPoolInt8, and MNNReluWithSlopeChannelInt8. It also applies clang-format to Int8FunctionsOpt.cpp.

Blocking issues

1. MNNInt8ScaleToFloat_RVV — scale/zero vector length mismatch bug

File: source/backend/cpu/riscv/rvv/MNNInt8ScaleToFloat.cpp, lines 15–37
v_scale and v_zero are initialized with vl=4 (only 4 elements written), but the main loop uses a much larger vl (e.g., VLMAX_e32m4 = 16 when VLEN=128). Elements beyond index 3 are undefined, producing incorrect results.
Suggested fix: Use the same vloxei32 + index-modulo pattern as MNNFloat2Int8_RVV to replicate the 4-element scale/zero pattern across the full vector length.

2. Missing #include for struct type definitions — compile error

MNNLineDepthWiseInt8AddBiasScaleUnit.cpp uses QuanPostTreatParameters without including its definition.
MNNReluWithSlopeChannelInt8.cpp uses QuanPrePostParameters without including its definition.
The existing MNNGemmInt8AddBiasScale_16x4_Unit_RVV.cpp correctly includes "../../compute/Int8FunctionsOpt.h" — the new files should do the same.

Minor suggestions

Rounding mode: RVV vfcvt_x_f defaults to round-to-nearest-even, while the reference C code uses roundf() (round-half-away-from-zero). Consider using __riscv_vfcvt_x_f_v_i32m4_rm(v, __RISCV_FRM_RMM, vl) if exact parity is needed.
MNNFloat2Int8.cpp is missing a trailing newline.
The bulk of Int8FunctionsOpt.cpp changes (~500 deleted lines) are clang-format reformatting — consider splitting into a separate commit for easier review.

Reviewed by qwen3.6-plus-preview

Sherlockzhangjinge and others added 3 commits April 8, 2026 19:07

implement some methods in Int8FunctionsOpt.cpp with RVV

670284b

Signed-off-by: typer-J <2236066784@qq.com> Signed-off-by: Sherlockzhangjinge <zjgzhangjinge@outlook.com> Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>

Merge branch 'alibaba:master' into master

0a30a37

Fix duplicate assignment for MNNMaxPoolInt8

5f6bce6

ADD missing semicolon

wangzhaode self-assigned this Apr 9, 2026

wangzhaode reviewed Apr 9, 2026

View reviewed changes

wangzhaode reviewed Apr 13, 2026

View reviewed changes

Sherlockzhangjinge added 2 commits April 15, 2026 12:56

[Core:Feature] Add new optimization layer implement some optimized me…

6c0b2d3

…thods by rvv; Fix missing simicolon and format problem Signed-off-by: typer-J <2236066784@qq.com> Signed-off-by: Sherlockzhangjinge <zjgzhangjinge@outlook.com> Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>

[CPU:Feature] fix format problem

323af24

Signed-off-by: typer-J <2236066784@qq.com> Signed-off-by: Sherlockzhangjinge <zjgzhangjinge@outlook.com> Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>

Sherlockzhangjinge force-pushed the feature/rvv_support branch from 426b43e to 323af24 Compare April 15, 2026 07:12

wangzhaode reviewed Apr 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/rvv support#4359

Feature/rvv support#4359
Sherlockzhangjinge wants to merge 5 commits intoalibaba:masterfrom
Sherlockzhangjinge:feature/rvv_support

Sherlockzhangjinge commented Apr 8, 2026

Uh oh!

wangzhaode left a comment

Uh oh!

wangzhaode left a comment

Uh oh!

wangzhaode left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Sherlockzhangjinge commented Apr 8, 2026

Description

Module

Type

Uh oh!

wangzhaode left a comment

Choose a reason for hiding this comment

Code Review

🔴 Blocking Issues

Uh oh!

wangzhaode left a comment

Choose a reason for hiding this comment

Review Summary

Blocking Issues

Minor Suggestions

Uh oh!

wangzhaode left a comment

Choose a reason for hiding this comment

Summary

Blocking issues

Minor suggestions

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants