Fix NHWC im2row output layout in HiFi backend to match Python reference#19681
Fix NHWC im2row output layout in HiFi backend to match Python reference#19681wl1026sun wants to merge 1 commit into
Conversation
Summary: The Cadence `im2row` operator's NHWC code paths in the HiFi backend produced an output layout inconsistent with the operator's Python reference implementation and with its own NCHW counterpart in the same file. **Contract (channel-major, `[c][kp]` per output position):** `ref_implementations.py` defines `cadence::im2row` via `torch.nn.functional.unfold`, producing `(N, num_patches, C*kH*kW)` with channel as the OUTER index. The HiFi NCHW runtime already matched this. **Pre-fix bug (kp-major, `[kp][c]`):** - HiFi NHWC template path in `op_im2row_out.cpp` (used for float inputs). - HiFi NHWC SIMD path in `xa_nn_im2row.c` (used for int8/uint8 inputs via `xa_nn_im2row_quantized`). **Fix:** 1. `op_im2row_out.cpp` — rewrite NHWC branch to write `[c][kp]`. 2. `xa_nn_im2row.c` — rewrite NHWC branch in the SIMD path to write `[c][kp]`. 3. `replace_ops.py` — add a `channel_last=True` guard in the AOT pass that elides `cadence::im2row` to `view_copy`. The elision was valid only under the pre-fix `[kp][c]` semantics (where NHWC im2row could be expressed as a no-op view of NHWC input); under the corrected layout, NHWC im2row actually rearranges data, so the elision would drop a required computation. Differential Revision: D96846886
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19681
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ✅ You can merge normally! (1 Unrelated Failure)As of commit 78595aa with merge base 3d86cc7 ( BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
@wl1026sun has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96846886. |
This PR needs a
|
Summary:
The Cadence
im2rowoperator's NHWC code paths in the HiFi backend producedan output layout inconsistent with the operator's Python reference
implementation and with its own NCHW counterpart in the same file.
Contract (channel-major,
[c][kp]per output position):ref_implementations.pydefinescadence::im2rowviatorch.nn.functional.unfold, producing(N, num_patches, C*kH*kW)withchannel as the OUTER index. The HiFi NCHW runtime already matched this.
Pre-fix bug (kp-major,
[kp][c]):op_im2row_out.cpp(used for float inputs).xa_nn_im2row.c(used for int8/uint8 inputs viaxa_nn_im2row_quantized).Fix:
op_im2row_out.cpp— rewrite NHWC branch to write[c][kp].xa_nn_im2row.c— rewrite NHWC branch in the SIMD path to write[c][kp].replace_ops.py— add achannel_last=Trueguard in the AOT pass thatelides
cadence::im2rowtoview_copy. The elision was valid only underthe pre-fix
[kp][c]semantics (where NHWC im2row could be expressed asa no-op view of NHWC input); under the corrected layout, NHWC im2row
actually rearranges data, so the elision would drop a required
computation.
Differential Revision: D96846886