Improve applyCompMatr summation accuracy #781
How come you separate out the real and imaginary components, mr robo?
Good question. I split them because the dense matrix-vector dot product ultimately has two scalar reductions: one for the real component and one for the imaginary component. Kahan compensation is a scalar summation algorithm, so the complex version here is one compensation stream per component. I also wanted to keep the compensation step explicit instead of relying on complex add/sub overloads in the hot loop. I can factor this into a small local helper if you prefer that shape.
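The component-wise scheme described in this reply can be sketched as a small helper. This is a minimal illustration under stated assumptions, not the code from the patch: `Comp`, `KahanComp`, `naiveSum` and `kahanSum` are hypothetical names, `qreal` is aliased to `double` for the example, and only the `.re`/`.im` member names are taken from the PR description.

```cpp
#include <vector>

// Assumption: qreal is double here; in the real code base it is configurable.
using qreal = double;

// Hypothetical stand-in for a complex amplitude with .re and .im members.
struct Comp { qreal re, im; };

// One Kahan compensation stream per component (illustrative helper).
struct KahanComp {
    qreal sumRe = 0, errRe = 0;
    qreal sumIm = 0, errIm = 0;

    // Classic scalar Kahan step: err carries the rounding error of the
    // previous addition and is subtracted from the next term.
    static void addScalar(qreal& sum, qreal& err, qreal term) {
        qreal y = term - err;   // apply the correction from earlier steps
        qreal t = sum + y;      // low-order bits of y may be rounded away here
        err = (t - sum) - y;    // what was actually added, minus what was meant
        sum = t;
    }

    void add(Comp z) {
        addScalar(sumRe, errRe, z.re);
        addScalar(sumIm, errIm, z.im);
    }

    Comp result() const { return {sumRe, sumIm}; }
};

// Plain accumulation, for comparison.
Comp naiveSum(const std::vector<Comp>& terms) {
    Comp acc{0, 0};
    for (const Comp& z : terms) { acc.re += z.re; acc.im += z.im; }
    return acc;
}

// Compensated accumulation using the helper above.
Comp kahanSum(const std::vector<Comp>& terms) {
    KahanComp acc;
    for (const Comp& z : terms) acc.add(z);
    return acc.result();
}
```

For the terms `{1e16, 1e16}, {1, -1}, {1, -1}, {-1e16, -1e16}` in IEEE double precision, `naiveSum` loses both small terms and returns `0 + 0i`, while `kahanSum` returns `2 - 2i`. Note that aggressive floating-point flags such as `-ffast-math` allow the compiler to simplify the compensation away, so a sketch like this only behaves as described under strict floating-point semantics.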
Closes #598.
Summary
- cpu_statevec_anyCtrlAnyTargDenseMatr_sub() with component-wise compensated summation over cpu_qcomp.re and cpu_qcomp.im.
- amps[i] inside the summation loop.
- applyCompMatr, which exercises the 3+ target CompMatr path rather than the one/two-target specialisations.
Notes
I saw the earlier closed attempt in #777. This version keeps the same narrow two-file scope, but uses direct qreal compensation for the real and imaginary components instead of relying on complex add/sub overloads in the hot loop. I also checked base_qcomp: the current operators are ordinary component-wise arithmetic, so independent real/imaginary Kahan compensation is compatible with the backend representation.
Local measurements
Configuration: Windows, GCC 13.2.0, Release, single CPU (QUEST_ENABLE_OMP=OFF, QUEST_ENABLE_MPI=OFF, QUEST_ENABLE_CUDA=OFF, QUEST_ENABLE_HIP=OFF). The benchmark applies a dense CompMatr whose first output row is [large+i*large, 1-i, ..., 1-i, -large-i*large] to an all-ones state. The expected first amplitude is (2^targets - 2) - i(2^targets - 2).
The measurements show the expected accuracy improvement. The overhead is visible for larger single/double precision reductions, which seems consistent with the tradeoff described in the issue.
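The expected first amplitude quoted for the benchmark follows from the structure of the row. As a worked check, write N = 2^targets for the row length and L for the value called large (both symbols are shorthand introduced here, not names from the patch). Applying the row to an all-ones state is just the sum of its entries:

```latex
\begin{aligned}
\sum_{j=0}^{N-1} M_{0j}\cdot 1
  &= (L + iL) \;+\; (N-2)(1 - i) \;+\; (-L - iL) \\
  &= (N-2) - i\,(N-2), \qquad N = 2^{\text{targets}} .
\end{aligned}
```

In exact arithmetic the two large entries cancel. In floating point, an uncompensated running sum can absorb the N-2 small entries into L before the final cancellation, which is presumably why the row is a useful stress case for the summation order.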
Testing
Results:
- *applyCompMatr*: passed, 10 test cases / 10,003 assertions.
- ctest: passed.
- git diff --check: passed; Git emitted only Windows line-ending conversion warnings.
Prepared with AI assistance; I reviewed the patch and ran the listed local checks.