feat: Expose accumulation mode flag via Conv2dInfo by Sqvid · Pull Request #1289 · ARM-software/ComputeLibrary

Sqvid · 2026-05-12T11:14:29Z

Users of the functional and experimental operator convolution APIs, e.g. arm_compute::NEGEMMConv2d or arm_compute::experimental::op::CpuGemmDirectConv2d, can opt in to fp32 accumulation by setting the use_fp32_acc flag in Conv2dInfo during the validate() and configure() steps.
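
A minimal sketch of the opt-in, for illustration only. `Conv2dInfoSketch` and `make_fp32_acc_info` are hypothetical names, not library API: the struct is a stand-in reduced to fields visible in the diff under review, so the snippet compiles without Compute Library headers. It assumes the field keeps the name `use_fp32_acc` and defaults to false.

```cpp
// Hypothetical stand-in for arm_compute::Conv2dInfo, reduced to fields
// visible in the diff under review. It is NOT the library type.
struct Conv2dInfoSketch
{
    bool         enable_fast_math{false};
    unsigned int num_groups{1};
    bool         use_fp32_acc{false}; // new field: off unless the caller opts in
};

// Build a descriptor that opts in to fp32 accumulation.
// With the real library, the same object would be handed to both
// validate() and configure() so the two steps agree on the mode.
inline Conv2dInfoSketch make_fp32_acc_info()
{
    Conv2dInfoSketch info{};
    info.use_fp32_acc = true;
    return info;
}
```

A default-constructed descriptor leaves the flag off, so callers who do nothing keep the non-fp32 accumulation path.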

Commit 5e40456 changed the default behaviour of CpuGemmDirectConv2d to accumulate in fp32 unless enable_fast_math was set. However, this can cause regressions for users who expect the old behaviour. This change exposes the flag to users directly, making fp32 accumulation opt-in.

Change-Id: I3203bdbbfa5152a64438941dd138bab6feb1cec2

cc: @morgolock @gunes-arm

Signed-off-by: Siddhartha Menon <siddhartha.menon@arm.com>

gunes-arm · 2026-05-12T15:40:10Z

    bool                enable_fast_math{false};
    unsigned int        num_groups{1};
    WeightsInfo         weights_info{};
+    bool                use_fp32_acc{false};


Can you add an inline comment saying:
// Relevant only for Fp16
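
For context on why the accumulator width matters for Fp16 data, here is a small self-contained sketch. It does not use or reproduce the library's kernels: `round_to_half_precision` and `accumulate` are hypothetical helpers that emulate a half-precision accumulator by rounding a float to an 11-bit significand after every add. Overflow and subnormals are ignored for simplicity.

```cpp
#include <cmath>

// Round a float to the nearest value with an 11-bit significand, i.e. the
// precision of IEEE half. Simplification: overflow and subnormals are ignored.
inline float round_to_half_precision(float x)
{
    int         exp  = 0;
    const float mant = std::frexp(x, &exp);            // x = mant * 2^exp, |mant| in [0.5, 1)
    const float q    = std::nearbyint(mant * 2048.0f); // keep 11 significant bits (ties to even by default)
    return std::ldexp(q / 2048.0f, exp);
}

// Sum `count` copies of `value`. With fp32_acc == false the running total is
// rounded to half precision after each add (emulated fp16 accumulator);
// with fp32_acc == true it is kept in float throughout.
inline float accumulate(float value, int count, bool fp32_acc)
{
    float acc = 0.0f;
    for (int i = 0; i < count; ++i)
    {
        acc += value;
        if (!fp32_acc)
        {
            acc = round_to_half_precision(acc);
        }
    }
    return acc;
}
```

Under these assumptions, accumulate(1.0f, 4096, false) stalls at 2048, because 2049 is not representable with 11 significand bits and rounds back to 2048, while accumulate(1.0f, 4096, true) reaches 4096. Long reduction dimensions in a convolution can hit the same effect, which is the kind of accuracy gap a wider accumulator avoids.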

Sqvid force-pushed the conv2d-acc-mode branch from cb5c54d to 2d8f125 May 12, 2026 11:33

Sqvid changed the title ~~feat: expose accumulation mode flag via Conv2dInfo~~ feat: Expose accumulation mode flag via Conv2dInfo May 12, 2026

Sqvid force-pushed the conv2d-acc-mode branch from 2d8f125 to fa8c826 May 12, 2026 15:33

Sqvid mentioned this pull request May 12, 2026

AArch64+SVE correctness failure in conv ( indirect_gemm:acl ) seen in pytorch unit test uxlfoundation/oneDNN#5106

Open

Sqvid wants to merge 1 commit into ARM-software:main from Sqvid:conv2d-acc-mode