Skip to content

Remove extern "C" wrapping and fix format specifiers for ARM embedded builds#18000

Merged
meta-codesync[bot] merged 1 commit intopytorch:mainfrom
Ninja91:export-D95739935
Mar 10, 2026
Merged

Remove extern "C" wrapping and fix format specifiers for ARM embedded builds#18000
meta-codesync[bot] merged 1 commit intopytorch:mainfrom
Ninja91:export-D95739935

Conversation

@Ninja91
Copy link
Contributor

@Ninja91 Ninja91 commented Mar 9, 2026

Summary:
Remove redundant extern "C" blocks wrapping CMSIS-NN header includes and fix -Wformat errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

Context

The extern "C" wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's arm_nn_math_types.h temporarily closes its inner extern "C" before including arm_mve.h, but the outer extern "C" from the op files remains active, forcing arm_mve.h into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (%hhd for ScalarType, %d/%ld for int64_t) cause -Wformat errors treated as build failures on ARM toolchains.

Changes:

  1. Removed extern "C" wrapping from all 13 op .cpp files and cmsis_scratch_buffer_context.h
  2. Consolidated CMSIS-NN includes in cortex_m_ops_common.h — added #include "arm_nnfunctions.h" (without extern "C") so op files get it transitively
  3. Added #include <cinttypes> for PRIi64 macro
  4. Fixed %hhd%d with static_cast<int> for ScalarType values
  5. Fixed %d/%ldPRIi64 for int64_t values

Differential Revision: D95739935

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

Copilot AI review requested due to automatic review settings March 9, 2026 00:17
@Ninja91 Ninja91 requested a review from rascani as a code owner March 9, 2026 00:17
@pytorch-bot
Copy link

pytorch-bot bot commented Mar 9, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18000

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 279 Pending

As of commit e529b28 with merge base dda73d3 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 9, 2026
@meta-codesync
Copy link
Contributor

meta-codesync bot commented Mar 9, 2026

@Ninja91 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D95739935.

@github-actions
Copy link

github-actions bot commented Mar 9, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Cortex-M backend CMSIS-NN integration to unblock ARM embedded (MVE) builds by removing problematic outer extern "C" wrapping around CMSIS-NN headers and by addressing -Wformat build failures from mismatched format specifiers.

Changes:

  • Removed file-local extern "C" blocks around arm_nnfunctions.h includes across Cortex-M op sources and cmsis_scratch_buffer_context.h.
  • Centralized CMSIS-NN includes in cortex_m_ops_common.h (including arm_nnfunctions.h) for transitive access from op files.
  • Fixed several format specifiers in cortex_m_ops_common.h (ScalarType printing and int64_t via PRIi64), and added <cinttypes>.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
backends/cortex_m/ops/op_transpose.cpp Removes local CMSIS-NN include wrapping (now relies on common header).
backends/cortex_m/ops/op_softmax.cpp Removes local CMSIS-NN include wrapping (now relies on common header).
backends/cortex_m/ops/op_quantized_transpose_conv2d.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_quantized_mul.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_quantized_max_pool2d.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_quantized_linear.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_quantized_depthwise_conv2d.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_quantized_conv2d.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_quantized_avg_pool2d.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_quantized_add.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_pad.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_minimum.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/op_maximum.cpp Removes local CMSIS-NN include wrapping.
backends/cortex_m/ops/cortex_m_ops_common.h Adds <cinttypes>, centralizes CMSIS-NN includes, and fixes format specifiers.
backends/cortex_m/ops/cmsis_scratch_buffer_context.h Removes extern "C" wrapping around CMSIS-NN include.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 81 to 86
ET_CHECK_MSG(
multiplier >= std::numeric_limits<int32_t>::min() &&
multiplier <= std::numeric_limits<int32_t>::max(),
"%s multiplier must be in int32 range [Value: %d]",
"%s multiplier must be in int32 range [Value: %" PRIi64 "]",
param_name,
multiplier);
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In validate_single_quant_params, the zero_point parameter is unused. If this header is compiled with -Wall, this can trigger -Wunused-parameter (often fatal with -Werror on embedded toolchains). Consider removing the parameter, marking it ET_UNUSED, or implementing the intended zero-point range validation.

Copilot uses AI. Check for mistakes.
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 9, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Differential Revision: D95739935
@Ninja91 Ninja91 force-pushed the export-D95739935 branch from a505827 to 75430c0 Compare March 9, 2026 03:43
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 9, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Differential Revision: D95739935
Copilot AI review requested due to automatic review settings March 9, 2026 05:45
@Ninja91 Ninja91 force-pushed the export-D95739935 branch from 75430c0 to b5b9003 Compare March 9, 2026 05:45
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +81 to 82
(void)zero_point;
ET_CHECK_MSG(
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validate_single_quant_params takes zero_point but explicitly ignores it. Callers (e.g., validate_quantization_params / ops casting zero_point down to int32_t) therefore don't validate that zero points are within an expected/representable range, and the nearby doc comment claims that zero points are checked. Consider either validating zero_point (e.g., int8 or at least int32 range, consistent with CMSIS/AOT expectations) or removing the parameter/updating the contract to avoid misleading validation.

Suggested change
(void)zero_point;
ET_CHECK_MSG(
ET_CHECK_MSG(
zero_point >= std::numeric_limits<int8_t>::min() &&
zero_point <= std::numeric_limits<int8_t>::max(),
"%s zero_point must be in int8 range [-128, 127] [Value: %" PRIi64 "]",
param_name,
zero_point);
ET_CHECK_MSG(

Copilot uses AI. Check for mistakes.
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 9, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Differential Revision: D95739935
@Ninja91 Ninja91 force-pushed the export-D95739935 branch from b5b9003 to 7675d3c Compare March 9, 2026 06:32
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 9, 2026
… builds (pytorch#18000)

Summary:
Pull Request resolved: pytorch#18000

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Differential Revision: D95739935
@Ninja91 Ninja91 force-pushed the export-D95739935 branch from 7675d3c to c85be2c Compare March 9, 2026 06:36
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 9, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Differential Revision: D95739935
Copilot AI review requested due to automatic review settings March 9, 2026 16:37
@Ninja91 Ninja91 force-pushed the export-D95739935 branch from c85be2c to 0f825db Compare March 9, 2026 16:37
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


#include "cortex_m_ops_common.h"
extern "C" {
#include "arm_nnfunctions.h"
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that cortex_m_ops_common.h includes CMSIS-NN headers, this file’s direct #include "arm_nnfunctions.h" is redundant and also puts CMSIS includes before the common header (which may introduce ordering issues depending on CMSIS header dependencies). Prefer relying on cortex_m_ops_common.h alone, or include it first and drop the standalone CMSIS include.

Suggested change
#include "arm_nnfunctions.h"

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed OOB, we typically follow the Google C++ style guide, which recommends against transitive includes.

Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 9, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Differential Revision: D95739935
@Ninja91 Ninja91 force-pushed the export-D95739935 branch from 0f825db to ea1bea5 Compare March 9, 2026 16:43
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 9, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Differential Revision: D95739935
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 10, 2026
… builds (pytorch#18000)

Summary:
Pull Request resolved: pytorch#18000

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Reviewed By: rascani

Differential Revision: D95739935
Copilot AI review requested due to automatic review settings March 10, 2026 02:22
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 10, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Reviewed By: rascani

Differential Revision: D95739935
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 10, 2026
… builds (pytorch#18000)

Summary:
Pull Request resolved: pytorch#18000

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Reviewed By: rascani

Differential Revision: D95739935
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

backends/cortex_m/ops/cortex_m_ops_common.h:105

  • The doc comment for validate_quantization_params claims it "Checks that zero points fit in int8 range", but validate_single_quant_params currently ignores zero_point entirely ((void)zero_point). Either add the zero_point range validation here or update the comment so it matches the actual checks being performed.
inline void validate_single_quant_params(
    const int64_t zero_point,
    const int64_t multiplier,
    const int64_t shift,
    const char* param_name) {
  (void)zero_point;
  ET_CHECK_MSG(
      multiplier >= std::numeric_limits<int32_t>::min() &&
          multiplier <= std::numeric_limits<int32_t>::max(),
      "%s multiplier must be in int32 range [Value: %" PRIi64 "]",
      param_name,
      multiplier);

  ET_CHECK_MSG(
      shift >= -31 && shift <= 31,
      "%s shift must be in range [-31, 31] [Value: %" PRIi64 "]",
      param_name,
      shift);
}

/**
 * Validate quantization parameters for inputs and output.
 *
 * Checks that zero points fit in int8 range, multipliers fit in int32 range,
 * and shifts are within a valid bit-shift range (0-31).
 *
 * Ensures parameters comply with Ahead-Of-Time (AOT) quantization requirements
 * and CMSIS-NN kernel expectations.
 *
 * Raises errors via ET_KERNEL_CHECK if any check fails.
 */

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +19 to 24
#include <cinttypes>
#include <limits>
#include <optional>

extern "C" {
#include "arm_nn_types.h"
}

Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description mentions consolidating CMSIS-NN includes by adding arm_nnfunctions.h to cortex_m_ops_common.h for transitive inclusion, but cortex_m_ops_common.h currently only includes arm_nn_types.h. Either update the PR description to reflect the current approach (explicit includes per op file) or actually add the include to match the described change.

Copilot uses AI. Check for mistakes.
Comment on lines +73 to +87
const int32_t out_n = input_dims.n + cmsis_pre_pad.n + cmsis_post_pad.n;
const int32_t out_h = input_dims.h + cmsis_pre_pad.h + cmsis_post_pad.h;
const int32_t out_w = input_dims.w + cmsis_pre_pad.w + cmsis_post_pad.w;
const int32_t out_c = input_dims.c + cmsis_pre_pad.c + cmsis_post_pad.c;

const int8_t pad_byte = static_cast<int8_t>(pad_value);
for (int32_t n = 0; n < out_n; ++n) {
for (int32_t h = 0; h < out_h; ++h) {
for (int32_t w = 0; w < out_w; ++w) {
for (int32_t c = 0; c < out_c; ++c) {
const int32_t out_idx = ((n * out_h + h) * out_w + w) * out_c + c;
const int32_t in_n = n - cmsis_pre_pad.n;
const int32_t in_h = h - cmsis_pre_pad.h;
const int32_t in_w = w - cmsis_pre_pad.w;
const int32_t in_c = c - cmsis_pre_pad.c;
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pad_out computes out_n/out_h/out_w/out_c from input_dims and pad values, then writes output_data using those computed extents without verifying that the provided 'out' tensor has matching sizes/numel. If out is incorrectly sized, this becomes an out-of-bounds write. Validate out sizes against the computed padded shape (or derive loop bounds from out.sizes()) before populating output_data.

Copilot uses AI. Check for mistakes.
Comment on lines 78 to 102
const int8_t pad_byte = static_cast<int8_t>(pad_value);
for (int32_t n = 0; n < out_n; ++n) {
for (int32_t h = 0; h < out_h; ++h) {
for (int32_t w = 0; w < out_w; ++w) {
for (int32_t c = 0; c < out_c; ++c) {
const int32_t out_idx = ((n * out_h + h) * out_w + w) * out_c + c;
const int32_t in_n = n - cmsis_pre_pad.n;
const int32_t in_h = h - cmsis_pre_pad.h;
const int32_t in_w = w - cmsis_pre_pad.w;
const int32_t in_c = c - cmsis_pre_pad.c;
if (in_n >= 0 && in_n < input_dims.n && in_h >= 0 &&
in_h < input_dims.h && in_w >= 0 && in_w < input_dims.w &&
in_c >= 0 && in_c < input_dims.c) {
const int32_t in_idx =
((in_n * input_dims.h + in_h) * input_dims.w + in_w) *
input_dims.c +
in_c;
output_data[out_idx] = input_data[in_idx];
} else {
output_data[out_idx] = pad_byte;
}
}
}
}
}
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pad_out replaces arm_pad_s8 with a scalar nested-loop implementation. This is likely slower and increases code size on embedded targets; since the goal is to fix linkage issues (not replace kernels), restore the CMSIS-NN arm_pad_s8 call now that CMSIS headers are included without an outer extern "C" wrapper.

Copilot uses AI. Check for mistakes.
@digantdesai digantdesai added the module: arm Issues related to arm backend label Mar 10, 2026
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 10, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Reviewed By: rascani

Differential Revision: D95739935
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 10, 2026
… builds (pytorch#18000)

Summary:
Pull Request resolved: pytorch#18000

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Reviewed By: rascani

Differential Revision: D95739935
Copilot AI review requested due to automatic review settings March 10, 2026 17:11
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 10, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Reviewed By: rascani

Differential Revision: D95739935
Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 10, 2026
… builds (pytorch#18000)

Summary:
Pull Request resolved: pytorch#18000

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Reviewed By: rascani

Differential Revision: D95739935
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Ninja91 added a commit to Ninja91/executorch that referenced this pull request Mar 10, 2026
… builds (pytorch#18000)

Summary:

Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Reviewed By: rascani

Differential Revision: D95739935
… builds

Summary:
Remove redundant `extern "C"` blocks wrapping CMSIS-NN header includes and fix `-Wformat` errors in format specifiers. These changes are required for the CC pipeline's FVP benchmark runner to compile on ARM embedded targets (Cortex-M55/M85 with MVE).

## Context
The `extern "C"` wrapping around CMSIS-NN headers causes ARM embedded builds targeting MVE-capable processors to fail. CMSIS-NN's `arm_nn_math_types.h` temporarily closes its inner `extern "C"` before including `arm_mve.h`, but the outer `extern "C"` from the op files remains active, forcing `arm_mve.h` into C linkage where C++ function overloading is illegal.

Additionally, format specifiers (`%hhd` for `ScalarType`, `%d`/`%ld` for `int64_t`) cause `-Wformat` errors treated as build failures on ARM toolchains.

Changes:
1. Removed `extern "C"` wrapping from all 13 op .cpp files and `cmsis_scratch_buffer_context.h`
2. Consolidated CMSIS-NN includes in `cortex_m_ops_common.h` — added `#include "arm_nnfunctions.h"` (without `extern "C"`) so op files get it transitively
3. Added `#include <cinttypes>` for `PRIi64` macro
4. Fixed `%hhd` → `%d` with `static_cast<int>` for `ScalarType` values
5. Fixed `%d`/`%ld` → `PRIi64` for `int64_t` values

Differential Revision: D95739935
Copilot AI review requested due to automatic review settings March 10, 2026 21:41
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@meta-codesync meta-codesync bot merged commit 3baf6c2 into pytorch:main Mar 10, 2026
317 of 318 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported module: arm Issues related to arm backend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants