Arm backend: Make aot_arm_compiler.py functions importable #18039
martinlsm wants to merge 4 commits into pytorch:main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18039
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 182 Pending as of commit fde836e with merge base 76dfb19.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label ciflow/trunk

To add these label(s) (ciflow/trunk) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page). This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorchbot label "partner: arm"

@pytorchbot label "release notes: none"
Pull request overview
This PR refactors examples/arm/aot_arm_compiler.py to make its public functions importable by an upcoming model evaluation program. It replaces the args namespace parameter with explicit individual parameters in several functions, introduces a QuantMode enum to replace the boolean is_int16x8 flag, and uses the leading underscore convention to categorize functions as public (importable) or private (internal).
Changes:
- Introduced a `QuantMode` enum (`INT8`, `A16W8`) to replace the `is_int16x8` boolean flag, and updated `quantize()` and `quantize_model()` to use it.
- Renamed internal functions (`get_args`, `get_compile_spec`, `save_bpte_program`, `to_edge_TOSA_delegate`, `to_edge_no_delegate`, `to_edge_cortex_m`) with leading underscores to mark them private, while keeping importable functions (`quantize`, `quantize_model`, `get_model_and_inputs_from_name`, `dump_delegation_info`) public.
- Refactored `_to_edge_TOSA_delegate`, `_to_edge_no_delegate`, and `quantize_model` to accept explicit parameters instead of the `args` namespace, and moved `compile_spec` construction and `quant_mode` determination into the `__main__` block.
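For readers unfamiliar with the pattern, here is a minimal sketch of such an enum. The member names come from the overview above; the actual definition lives in examples/arm/aot_arm_compiler.py and may differ in details.

```python
from enum import Enum, auto


class QuantMode(Enum):
    """Target quantization encoding (replaces the old is_int16x8 bool)."""
    INT8 = auto()   # int8 activations and weights
    A16W8 = auto()  # 16-bit activations, 8-bit weights


# Callers now select an explicit mode instead of toggling a boolean flag:
mode = QuantMode.A16W8
print(mode.name)  # → A16W8
```

Compared with a boolean, the enum makes call sites self-describing and leaves room for further encodings without another flag.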
```diff
 def _to_edge_no_delegate(
     exported_program: ExportedProgram,
-    args,
     compile_spec,
     model: GraphModule,
+    quant_mode: Optional[QuantMode],
     example_inputs: Tuple[torch.Tensor],
+    model_name: str,
+    strict_export: bool,
 ):
     model_quant = None
-    if args.quantize:
+    if quant_mode is not None:
         # As we can target multiple output encodings, one must
         # be specified.
-        compile_spec = get_compile_spec(args)
         model, exported_program = quantize_model(
-            args, model, example_inputs, compile_spec
+            model,
+            example_inputs,
+            compile_spec,
+            model_name,
+            strict_export,
+            quant_mode,
         )
         model_quant = model
```
_to_edge_no_delegate no longer receives args as a parameter (it was replaced by individual parameters), but the function body at line 815 still references args via _apply_replace_quant_nodes(edge, args). This will raise NameError when the function is called from an importing module rather than from __main__. Similar to _to_edge_TOSA_delegate, the required args fields (target and direct_drive, as used by _apply_replace_quant_nodes) need to be passed explicitly as parameters.
```python
match quant_mode:
    case QuantMode.INT8:
        operator_config = get_symmetric_quantization_config(is_per_channel=True)
    case QuantMode.A16W8:
        if compile_specs.tosa_spec.support_extension("int16"):
            operator_config = get_symmetric_a16w8_quantization_config(
                is_per_channel=True
            )
        else:
            raise ValueError(
                f"Context TOSA spec {compile_specs.tosa_spec} doesn't support int16"
            )
```
The match statement does not have a default/wildcard case. If a new QuantMode variant is added in the future, operator_config will be unbound at line 264, causing an UnboundLocalError. Consider adding a wildcard case that raises a ValueError with a descriptive message (e.g., case _: raise ValueError(f"Unsupported quant_mode: {quant_mode}")).
```diff
 def _to_edge_TOSA_delegate(
     exported_program: ExportedProgram,
-    args,
     compile_spec,
     model: GraphModule,
+    quant_mode: Optional[QuantMode],
     example_inputs: Tuple[torch.Tensor],
+    model_name: str,
+    strict_export: bool,
 ):
-    # As we can target multiple output encodings, one must
-    # be specified.
-    compile_spec = get_compile_spec(args)
     model_quant = None
-    if args.quantize:
+    if quant_mode is not None:
         model_quant, exported_program = quantize_model(
-            args, model, example_inputs, compile_spec
+            model,
+            example_inputs,
+            compile_spec,
+            model_name,
+            strict_export,
+            quant_mode,
         )

     partitioner = create_partitioner(compile_spec)
```
_to_edge_TOSA_delegate no longer receives args as a parameter (it was replaced by individual parameters), but the function body at line 714 still references args via _apply_replace_quant_nodes(edge, args). This will resolve to the module-level args global when run as __main__, but will raise NameError when imported and called from another module — which is the use case this PR is enabling. The args dependency needs to be replaced, e.g., by passing the required values (args.target and args.direct_drive per _apply_replace_quant_nodes) explicitly as parameters to this function.
The model evaluation feature, i.e. computing a model's top-1/top-5 accuracy, is to be moved into a new Python program. Some functions in aot_arm_compiler will then need to be imported by the evaluation program (to get a similar compilation flow and to reuse existing, working code). Make the functions in aot_arm_compiler "importable" by no longer passing the program args to them; the importing program will not have any such args. Also, consistently categorize public/private functions and variables with the leading-underscore convention.

Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com>
Change-Id: Iaf2a2661956fe1d122d6ef2bbe023e6b8ddc501c
If the output directory does not exist, aot_arm_compiler will now create it before writing the output files there.

Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com>
Change-Id: I91374cc3e7105da7d22327ac85b052c406665369
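The new behavior can be sketched like this; the helper name is illustrative, and the real change lives inside aot_arm_compiler.py's output handling.

```python
import os
import tempfile


def ensure_output_dir(path: str) -> str:
    """Create the output directory (and parents) before writing artifacts."""
    # exist_ok=True makes this a no-op when the directory already exists,
    # so repeated runs into the same output directory keep working.
    os.makedirs(path, exist_ok=True)
    return path


# Usage: a nested path that does not exist yet is created on demand.
with tempfile.TemporaryDirectory() as tmp:
    out = ensure_output_dir(os.path.join(tmp, "nested", "out"))
    print(os.path.isdir(out))  # → True
```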
Force-pushed from 63b46ff to 7097c88
@pytorchbot label ciflow/trunk

To add these label(s) (ciflow/trunk) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page). This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.
Calibration was tied to the ImageNet evaluator, so when the evaluation feature was removed from aot_arm_compiler.py, custom calibration of a quantized model was no longer possible.

This patch reintroduces quantization calibration in aot_arm_compiler.py, now controlled by a flag called --calibration_data. This flag can target a file that is either a common image format or a serialized file (.pt), or a directory, in which case the program searches for applicable files within the directory.

Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com>
Change-Id: I06f8b15b850ba8aff037b7ec617c194b1760a494
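A minimal sketch of how such a --calibration_data path could be resolved. The helper name and directory-scan details are assumptions; only the .pt/.pth extensions and the 1000-sample cap come from the PR discussion.

```python
import os

MAX_CALIBRATION_SAMPLES = 1000  # cap mentioned in the PR discussion


def resolve_calibration_paths(path: str) -> list:
    """Return the files to load calibration samples from (sketch only)."""
    if os.path.isfile(path):
        # A single file is used directly.
        return [path]
    if os.path.isdir(path):
        # A directory is scanned for applicable serialized files,
        # capped at MAX_CALIBRATION_SAMPLES entries.
        files = sorted(
            os.path.join(path, name)
            for name in os.listdir(path)
            if name.endswith((".pt", ".pth"))
        )
        return files[:MAX_CALIBRATION_SAMPLES]
    raise FileNotFoundError(f"No calibration data at {path}")
```

The real implementation additionally accepts common image formats; this sketch only covers the serialized-file case.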
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.
```diff
 def _to_edge_no_delegate(
     exported_program: ExportedProgram,
-    args,
     compile_spec,
     model: GraphModule,
     quant_mode: Optional[QuantMode],
     example_inputs: Tuple[torch.Tensor],
     model_name: str,
     strict_export: bool,
+    calibration_samples: Optional[List[Tuple[torch.Tensor, ...]]],
 ):
```
_to_edge_no_delegate() no longer takes args, but the body still relies on args later (via _apply_replace_quant_nodes(edge, args)). If this module is imported and the function is called without the main path having created a module-global args, this will fail. Pass the needed values explicitly (either add args back or pass the specific fields into _apply_replace_quant_nodes).
```python
help=(
    "Optional calibration data file or directory. If a directory is "
    "provided, up to 1000 samples are used for calibration. "
    "Supported files: .pt/.pth. If not provided,"
```
In the --calibration_data help text, the string concatenation is missing a space after "If not provided," so the rendered help becomes "If not provided,quantized...". Add a trailing/leading space to one of the adjacent string literals.
```diff
-    "Supported files: .pt/.pth. If not provided,"
+    "Supported files: .pt/.pth. If not provided, "
```
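The underlying Python behavior: adjacent string literals are concatenated with no separator, so a missing space inside one literal silently glues words together in the rendered help text. A quick demonstration (the second literal's wording is illustrative, not the PR's actual text):

```python
# Adjacent string literals are joined at compile time with no separator.
broken = (
    "Supported files: .pt/.pth. If not provided,"
    "quantized models are calibrated on the example inputs."
)
fixed = (
    "Supported files: .pt/.pth. If not provided, "
    "quantized models are calibrated on the example inputs."
)
print("provided,quantized" in broken)  # → True (words glued together)
print("provided, quantized" in fixed)  # → True (space restored)
```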
```python
    model: GraphModule,
    model_name: str,
    compile_specs: ArmCompileSpec,
    example_inputs: Tuple[torch.Tensor],
```
Type hint for example_inputs in quantize() is Tuple[torch.Tensor], which denotes a 1-element tuple. This code supports multiple inputs (and calibration_samples uses Tuple[torch.Tensor, ...]), so this should be Tuple[torch.Tensor, ...] (and ideally kept consistent across quantize_model/to_edge* signatures).
```diff
-    example_inputs: Tuple[torch.Tensor],
+    example_inputs: Tuple[torch.Tensor, ...],
```
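The typing distinction the comment refers to, demonstrated with `int` in place of `torch.Tensor`; annotations are not enforced at runtime, but a static checker would flag the mismatch.

```python
from typing import Tuple

# Tuple[int] means a tuple of exactly one int;
# Tuple[int, ...] means a tuple of any length.
one_int: Tuple[int] = (1,)               # fixed length 1
many_ints: Tuple[int, ...] = (1, 2, 3)   # variadic, any length

print(len(one_int))    # → 1
print(len(many_ints))  # → 3
```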
```python
if calibration_samples is None:
    calibration_samples = [example_inputs]

for sample in calibration_samples:
```
Cortex-M calibration loop uses calibration_samples directly without validating that each sample matches example_inputs (length/shape/type). If a sample is malformed, this will fail with a less actionable exception from prepared(*...). Consider reusing _validate_calibration_sample() here as well for consistent, clearer errors.
```diff
 for sample in calibration_samples:
+    _validate_calibration_sample(sample, example_inputs)
```
```python
        else:
            raise ValueError(
                f"Context TOSA spec {compile_specs.tosa_spec} doesn't support int16"
            )
```
quantize(): the match quant_mode: block has no default case. If a caller passes an unexpected value (e.g., a plain string), operator_config will remain unset and the function will fail later with UnboundLocalError at quantizer.set_global(). Add an explicit default/else branch that raises a ValueError for unsupported quant modes.
```diff
             )
+        case _:
+            raise ValueError(f"Unsupported quantization mode: {quant_mode}")
```
The failure in trunk / test-mcu-cortex-m-backend / linux-job (push) is unrelated; that test sometimes times out on main branch testing as well.
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell