Apply QuantizeInputs and QuantizeOutputs passes in the Cortex-M compilation path to strip the float-in/float-out wrapper from quantized models. This produces a fully int8 model that accepts and returns int8 tensors directly. The passes are applied after to_edge_transform_and_lower but before CortexMPassManager, since the latter renames quantized_decomposed ops to cortex_m variants which the I/O passes cannot recognize.
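The ordering constraint described above can be illustrated with a toy model. This is only a sketch: ops are plain strings and the two functions are hypothetical stand-ins, not the real QuantizeInputs/QuantizeOutputs passes or CortexMPassManager. It shows why a pass that matches on the `quantized_decomposed` namespace must run before a pass that renames those ops.

```python
# Toy model (assumption): a graph is a list of op-name strings.
QDQ_NAMESPACE = "quantized_decomposed"


def strip_float_io(ops):
    """Hypothetical I/O pass: drop the boundary quantize/dequantize ops it recognizes."""
    ops = list(ops)
    if ops and ops[0] == f"{QDQ_NAMESPACE}.quantize_per_tensor":
        ops = ops[1:]
    if ops and ops[-1] == f"{QDQ_NAMESPACE}.dequantize_per_tensor":
        ops = ops[:-1]
    return ops


def rename_to_backend(ops):
    """Hypothetical backend pass: rename quantized_decomposed ops to cortex_m variants."""
    return [op.replace(f"{QDQ_NAMESPACE}.", "cortex_m.") for op in ops]


graph = [
    "quantized_decomposed.quantize_per_tensor",
    "aten.conv2d",
    "quantized_decomposed.dequantize_per_tensor",
]

# Correct order: strip the float wrapper first, then rename.
good = rename_to_backend(strip_float_io(graph))

# Wrong order: after renaming, the I/O pass no longer recognizes the boundary ops.
bad = strip_float_io(rename_to_backend(graph))
```

In the wrong order the boundary ops survive under their new names, so the model keeps its float-in/float-out wrapper.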
Add a self-contained HTML graph visualizer as devtools/visualization/html_visualization.py, complementing the existing Model Explorer-based visualization. Generates interactive Cytoscape.js HTML files from .pt2, .pte, .etrecord, and multi-pass trace .json files with no server or external dependencies required.
Key changes from the original repo-root visualize_graph.py:
- Fix broken ETRecord import (executorch.sdk -> executorch.devtools.etrecord) and rewrite extract_from_etrecord to use correct ETRecord attributes (edge_dialect_program, graph_map) instead of non-existent graph_module
- Replace Cortex-M-specific "cortex_m" category with generic "backend" category and configurable _BACKEND_OP_PREFIXES for all backends
- Merge duplicate extract_from_pte / extract_from_pte_enhanced into one function with bounds checking and generic delegate blob analysis
- Add escapeHtml to single-pass HTML template (XSS fix)
- Fix O(n*m) edge filter to O(n) set lookup
- Remove dead code (extract_delegated_graph, unreachable PTE branches)
- Replace Arm-specific extract_arm_delegate_info with backend-agnostic _extract_delegate_blob_info
- Make __init__.py imports from visualization_utils conditional so html_visualization works without model_explorer installed
The old visualize_graph.py becomes a thin deprecation shim.
Authored with Claude.
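The edge-filter change from O(n*m) to set lookup can be sketched as follows. This is a minimal illustration, not the module's actual code: the `source`/`target` edge keys and both function names are assumptions chosen to resemble Cytoscape-style edge records.

```python
def filter_edges_list_lookup(edges, node_ids):
    # Membership test against a list scans it each time: O(len(edges) * len(node_ids)).
    return [e for e in edges if e["source"] in node_ids and e["target"] in node_ids]


def filter_edges_set_lookup(edges, node_ids):
    # Build the set once; each membership test is then O(1) on average.
    known = set(node_ids)
    return [e for e in edges if e["source"] in known and e["target"] in known]


node_ids = ["n0", "n1", "n2"]
edges = [
    {"source": "n0", "target": "n1"},
    {"source": "n1", "target": "n2"},
    {"source": "n2", "target": "missing"},  # dangling edge, should be dropped
]

kept = filter_edges_set_lookup(edges, node_ids)
```

Both versions return the same edges; only the cost of the membership test differs.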
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18027
Note: Links to docs will display an error until the docs builds have been completed.
❌ 5 New Failures
As of commit 191ef83 with merge base 0907294:
NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR introduces a self-contained HTML visualization module for ExecuTorch graphs at devtools/visualization/html_visualization.py, supporting .pt2, .pte, .etrecord, and multi-pass trace .json files. It also adds a Cortex-M pass tracing script, a deprecated shim at the repo root, and unrelated quantized I/O changes to the ARM compiler.
Changes:
- New devtools/visualization/html_visualization.py module providing interactive HTML graph visualization using Cytoscape.js, with support for multiple file formats and multi-pass trace views
- New trace_cortex_m_passes.py script at the repo root that traces the Cortex-M compilation pipeline pass-by-pass and generates JSON snapshots
- Changes to examples/arm/aot_arm_compiler.py adding QuantizeInputs/QuantizeOutputs passes for fully int8 I/O on Cortex-M
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| devtools/visualization/html_visualization.py | New HTML visualization module with graph extraction from multiple formats and self-contained HTML generation |
| devtools/visualization/__init__.py | Updated to export html_visualization functions and wrap model_explorer imports in try/except |
| devtools/visualization/TARGETS | Added html_visualization.py source and etrecord dependency |
| visualize_graph.py | New deprecated shim at repo root re-exporting from html_visualization |
| trace_cortex_m_passes.py | New script at repo root for tracing Cortex-M compilation passes |
| examples/arm/aot_arm_compiler.py | Added QuantizeInputs/QuantizeOutputs passes for int8 I/O in cortex-m path |
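The try/except wrapping of optional imports mentioned for `__init__.py` can be sketched like this. It is an illustration under assumptions: `optional_import` and `HAS_MODEL_EXPLORER` are hypothetical names, and the real `__init__.py` may simply use a plain `try: ... except ImportError:` around its imports.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# The package can still be imported when the optional dependency is absent.
model_explorer = optional_import("model_explorer")
HAS_MODEL_EXPLORER = model_explorer is not None
```

Callers that need the Model Explorer path can check the flag and raise a clear error, while the HTML path never touches the optional dependency.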
    from executorch.exir.pass_base import ExportPass
    from executorch.exir.program._program import _transform
    from torch.export import export
    from torchao.quantization.pt2e.export_utils import model_is_exported
model_is_exported is imported but never used in this file. This will fail linting (F401 unused import). Remove this import.
    """Trace the Cortex-M compilation pipeline pass-by-pass, capturing graph snapshots.

    Runs quantization, export, to_edge, then each CortexMPassManager pass individually,
    saving a JSON file with per-pass graph snapshots for use with visualize_graph.py.

    Usage:
        python3 trace_cortex_m_passes.py --model mobilenet_v2 -o mv2_trace.json

    Authored with Claude.
    """
Both trace_cortex_m_passes.py and visualize_graph.py are placed at the repository root, which is unconventional for this project. The repo root contains build/setup scripts (setup.py, install_requirements.py), but domain-specific tools like these belong under devtools/ or examples/. The trace_cortex_m_passes.py script is Cortex-M/ARM specific and would be better placed under examples/arm/ or backends/cortex_m/scripts/, and visualize_graph.py (as a deprecated shim) clutters the root. Consider moving them to more appropriate locations.
    for i, idx in enumerate(plan.inputs):
        node_id = f"input_{i}"
        val = plan.values[idx].val
        details = {"value_index": idx}
        if isinstance(val, Tensor):
            details["shape"] = list(val.sizes)
            details["dtype"] = (
                val.scalar_type.name
                if hasattr(val.scalar_type, "name")
                else str(val.scalar_type)
            )

        nodes.insert(
            0,
            {
                "id": node_id,
                "label": f"input_{i}",
                "w": 70,
                "category": "placeholder",
                "op_name": "input",
                "details": details,
            },
        )
        value_producers[idx] = node_id
The input node loop (lines 322-345) sets value_producers[idx] = node_id after the kernel/delegate loop (lines 229-320). This means when the kernel loop checks if val_idx in value_producers (line 279), input value indices won't be in value_producers yet, so edges from inputs to the first kernel nodes that consume them will be missing. The input processing should be moved before the kernel/delegate loop, or a separate pre-pass should populate value_producers with input indices first.
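A minimal sketch of the suggested fix, assuming a simplified execution plan: `build_edges` and the `(node_id, consumed, produced)` instruction tuples are hypothetical stand-ins for the real PTE structures. The point is only that input value indices are registered in `value_producers` before the kernel loop looks them up.

```python
def build_edges(input_value_indices, instructions):
    """Return (producer_id, consumer_id) edges for a simplified plan.

    instructions: list of (node_id, consumed_value_indices, produced_value_indices).
    """
    value_producers = {}
    edges = []

    # Pre-pass: register graph inputs as producers BEFORE walking the kernels.
    for i, idx in enumerate(input_value_indices):
        value_producers[idx] = f"input_{i}"

    # Kernel loop: every consumed value with a known producer yields an edge.
    for node_id, consumed, produced in instructions:
        for val_idx in consumed:
            if val_idx in value_producers:
                edges.append((value_producers[val_idx], node_id))
        for val_idx in produced:
            value_producers[val_idx] = node_id

    return edges


edges = build_edges(
    input_value_indices=[0],
    instructions=[("op_0", [0], [1]), ("op_1", [1], [2])],
)
```

If the pre-pass ran after the kernel loop instead, the `input_0 -> op_0` edge would be missing, which is the behavior the review comment describes.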
    # Import here so it's available for the isinstance check in the pass loop
    from executorch.backends.arm._passes import FoldAndAnnotateQParamsPass  # noqa: E402
FoldAndAnnotateQParamsPass is used inside run_pipeline() on line 384, but its import is placed here after the function definition. While this works at runtime (since the import executes before run_pipeline is called), it is a maintenance hazard—the code reads as if the name is undefined at the point of use, and any refactoring that calls run_pipeline at import time would break. Move this import to the top of the file alongside the other executorch.backends.arm imports.
    CATEGORY_COLORS = {
        "backend": "#4caf50",
        "aten_compute": "#2196f3",
        "quantize": "#ff9800",
        "memory": "#9e9e9e",
        "placeholder": "#03a9f4",
        "param": "#78909c",
        "delegate": "#ab47bc",
    }


    def categorize_node(op_name: str) -> str:
        name = op_name.lower()
        if "cortex_m" in name:
            return "backend"
        if any(
            k in name
            for k in (
                "quantize_per_tensor",
                "dequantize_per_",
                "quantize_per_channel",
                "dequantize_per_channel",
            )
        ):
            return "quantize"
        if any(
            k in name
            for k in (
                "view",
                "clone",
                "permute",
                "slice",
                "copy",
                "expand",
                "reshape",
                "t_copy",
                "unsqueeze",
                "squeeze",
            )
        ):
            return "memory"
        if any(k in name for k in ("placeholder", "output", "getitem", "get_attr")):
            return "placeholder"
        if "delegate" in name:
            return "delegate"
        return "aten_compute"


    def _make_label(op_name: str) -> str:
        name = op_name.split("::")[-1] if "::" in op_name else op_name
        if "." in name:
            name = name.rsplit(".", 1)[0]
        if len(name) > 30:
            name = name[:27] + "..."
        return name
CATEGORY_COLORS, categorize_node, _make_label, and extract_from_exported_program are all duplicated from devtools/visualization/html_visualization.py. Since this PR introduces the canonical html_visualization module, these functions should be imported from there rather than redefined. This will avoid the two copies drifting apart (e.g., html_visualization.categorize_node already uses the more generic _BACKEND_OP_PREFIXES tuple while this copy only checks for "cortex_m").
    def generate_html(graph_data: dict, output_path: str) -> None:
        html = HTML_TEMPLATE
        html = html.replace("$$MODEL_NAME$$", graph_data["metadata"]["model_name"])
        html = html.replace("$$GRAPH_JSON$$", json.dumps(graph_data))
        html = html.replace("$$COLORS_JSON$$", json.dumps(CATEGORY_COLORS))
        with open(output_path, "w") as f:
            f.write(html)
        print(
            f"Wrote {output_path} "
            f"({graph_data['metadata']['total_nodes']} nodes, "
            f"{len(graph_data['edges'])} edges)"
        )


    def generate_multi_pass_html(trace_data: dict, output_path: str) -> None:
        model_name = trace_data.get("model_name", "unknown")
        passes = trace_data["passes"]

        html = MULTI_PASS_HTML_TEMPLATE
        html = html.replace("$$MODEL_NAME$$", model_name)
        html = html.replace("$$PASSES_JSON$$", json.dumps(passes))
        html = html.replace("$$COLORS_JSON$$", json.dumps(CATEGORY_COLORS))
The model_name is inserted into the HTML via html.replace("$$MODEL_NAME$$", model_name) without HTML-escaping. In the multi-pass template, the $$MODEL_NAME$$ placeholder appears inside a <span> element (line 673 of the template), which means a model name containing HTML characters (e.g., from a maliciously crafted trace JSON file loaded via extract_from_trace_json) could inject arbitrary HTML/JS. Consider HTML-escaping model_name before substitution, e.g., using html.escape(model_name).
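The escaping suggested in this comment could look like the sketch below. `TEMPLATE` and `render_title` are hypothetical stand-ins for the real multi-pass template and generator. One detail worth noting: the quoted function binds a local variable named `html`, which would shadow the standard-library `html` module, so this sketch imports `escape` directly and uses a different variable name.

```python
from html import escape

# Stand-in for the real template (assumption): the placeholder sits inside a <span>.
TEMPLATE = "<span class='model'>$$MODEL_NAME$$</span>"


def render_title(model_name: str) -> str:
    # Escape &, <, > and quotes so the name is rendered as text, not markup.
    page = TEMPLATE.replace("$$MODEL_NAME$$", escape(model_name))
    return page


rendered = render_title("<img src=x onerror=alert(1)>")
```

With the escape in place, a crafted model name from a trace JSON file shows up literally in the page instead of being parsed as an element.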
    def categorize_node(op_name: str) -> str:
        name = op_name.lower()
        if any(prefix in name for prefix in _BACKEND_OP_PREFIXES):
            return "backend"
        if any(
            k in name
            for k in (
                "quantize_per_tensor",
                "dequantize_per_",
                "quantize_per_channel",
                "dequantize_per_channel",
            )
        ):
            return "quantize"
        if any(
            k in name
            for k in (
                "view",
                "clone",
                "permute",
                "slice",
                "copy",
                "expand",
                "reshape",
                "t_copy",
                "unsqueeze",
                "squeeze",
            )
        ):
            return "memory"
        if any(k in name for k in ("placeholder", "output", "getitem", "get_attr")):
            return "placeholder"
        if "delegate" in name:
            return "delegate"
        return "aten_compute"


    def _make_label(op_name: str) -> str:
        name = op_name.split("::")[-1] if "::" in op_name else op_name
        if "." in name:
            name = name.rsplit(".", 1)[0]
        if len(name) > 30:
            name = name[:27] + "..."
        return name
This module has no unit tests. The existing visualization_utils_test.py covers the model_explorer-based visualization but there is no test file for html_visualization. At minimum, extract_from_exported_program, categorize_node, generate_html, and generate_multi_pass_html should have tests to prevent regressions.
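As a starting point, tests for `categorize_node` might look like the sketch below. To keep the example self-contained it carries a condensed stand-in copy of the function from the hunk above, with `_BACKEND_OP_PREFIXES = ("cortex_m",)` as an assumed value; a real test file would import the function from html_visualization instead.

```python
# Stand-in copy for illustration only (assumption: prefix tuple contents).
_BACKEND_OP_PREFIXES = ("cortex_m",)
_QUANT_KEYS = ("quantize_per_tensor", "dequantize_per_", "quantize_per_channel")
_MEMORY_KEYS = ("view", "clone", "permute", "slice", "copy", "expand",
                "reshape", "unsqueeze", "squeeze")
_PLACEHOLDER_KEYS = ("placeholder", "output", "getitem", "get_attr")


def categorize_node(op_name: str) -> str:
    name = op_name.lower()
    if any(prefix in name for prefix in _BACKEND_OP_PREFIXES):
        return "backend"
    if any(k in name for k in _QUANT_KEYS):
        return "quantize"
    if any(k in name for k in _MEMORY_KEYS):
        return "memory"
    if any(k in name for k in _PLACEHOLDER_KEYS):
        return "placeholder"
    if "delegate" in name:
        return "delegate"
    return "aten_compute"


def test_backend_prefix_wins_over_quantize():
    # The backend check runs first, so a renamed quantize op is "backend".
    assert categorize_node("cortex_m::quantize_per_tensor") == "backend"


def test_quantize_and_memory_ops():
    assert categorize_node("quantized_decomposed.quantize_per_tensor.default") == "quantize"
    assert categorize_node("aten.view_copy.default") == "memory"


def test_placeholder_delegate_and_fallback():
    assert categorize_node("placeholder") == "placeholder"
    assert categorize_node("executorch_call_delegate") == "delegate"
    assert categorize_node("aten.convolution.default") == "aten_compute"
```

The functions follow the pytest naming convention, so they can be dropped into a test file and collected as-is.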
    # Strip the float I/O wrapper from the quantized model to produce
    # fully int8 inputs and outputs. This must run before CortexMPassManager
    # which renames quantized_decomposed ops to cortex_m variants.
    if args.quantize:
        print("Applying passes to create a fully int8 quantized model...")

        edge = edge.transform([
            QuantizeInputs(edge, [0]),
            QuantizeOutputs(edge, [0]),
        ])
The changes in this file (adding QuantizeInputs/QuantizeOutputs passes) are about producing fully int8 quantized I/O for Cortex-M models, which appears unrelated to the PR title "Visualization html module". Consider splitting this into a separate PR to keep the visualization module changes focused and reviewable independently.
    # fully int8 inputs and outputs. This must run before CortexMPassManager
    # which renames quantized_decomposed ops to cortex_m variants.
    if args.quantize:
        print("Applying passes to create a fully int8 quantized model...")
This uses print() while the surrounding code in to_edge_cortex_m consistently uses logging.info()/logging.warning() (see lines 809, 814, 825). Use logging.info() instead of print() to maintain consistency.
Add devtools/visualization/trace_passes.py — a generic version of
trace_cortex_m_passes.py that works with any ExecuTorch backend.
Traces quantization, export, to_edge, then each backend pass individually,
producing a JSON file with per-pass graph snapshots for visualization with
html_visualization.py.
Supports 5 backends out of the box:
- cortex_m: full pass-by-pass tracing (8 passes)
- xnnpack: full pass-by-pass tracing (16 passes)
- cadence: full pass-by-pass tracing
- vulkan: export/edge stages (no static pass list)
- qnn: export/edge stages (no static pass list)
New backends can be added by calling register_backend() with a BackendConfig
specifying the quantizer class, pass list source, and edge compile config.
Usage:
    python -m executorch.devtools.visualization.trace_passes \
        --backend xnnpack --model mobilenet_v2 -o trace.json
    python -m executorch.devtools.visualization.html_visualization \
        trace.json -o trace.html
Authored with Claude.
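The commit message above mentions register_backend() and BackendConfig but does not show their signatures. The sketch below is a hypothetical registry of that general shape, not the API in trace_passes.py: every field name, the `get_backend` helper, and the duplicate-name check are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class BackendConfig:
    """Hypothetical config; field names are assumptions, not the real API."""
    name: str
    quantizer_cls: Optional[type] = None          # quantizer class, if any
    get_passes: Callable[[], List[Any]] = list    # pass list source; empty by default
    edge_compile_config: Optional[Any] = None     # passed through to to_edge


_BACKENDS: Dict[str, BackendConfig] = {}


def register_backend(config: BackendConfig) -> None:
    if config.name in _BACKENDS:
        raise ValueError(f"backend {config.name!r} is already registered")
    _BACKENDS[config.name] = config


def get_backend(name: str) -> BackendConfig:
    return _BACKENDS[name]


# Usage: a backend with a static pass list (placeholder pass names).
register_backend(
    BackendConfig(name="example_backend", get_passes=lambda: ["PassA", "PassB"])
)
```

Making the pass list a callable lets a backend defer importing its passes until tracing starts, and backends without a static pass list can fall back to the empty default and trace only the export/edge stages.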
Self-contained, interactive graph visualization for ExecuTorch
Generates a single HTML file you can open in any browser to explore your model's computation graph with
full interactivity: click nodes for details, zoom/pan, color-coded categories, and pass-by-pass diffing.
Supported formats
┌────────────────────┬───────────────────────────────────────────────────────┐
│ Format │ What you see │
├────────────────────┼───────────────────────────────────────────────────────┤
│ .pt2 │ Pre-serialization ExportedProgram graph │
├────────────────────┼───────────────────────────────────────────────────────┤
│ .pte │ Post-serialization execution plan with delegate info │
├────────────────────┼───────────────────────────────────────────────────────┤
│ .etrecord │ Full graph including pre-delegation stages │
├────────────────────┼───────────────────────────────────────────────────────┤
│ .json (multi-pass) │ Step-through each compiler pass with node count diffs │
└────────────────────┴───────────────────────────────────────────────────────┘
Usage
CLI:
python -m executorch.devtools.visualization.html_visualization model.pte -o graph.html
python -m executorch.devtools.visualization.html_visualization trace.json -o passes.html
Python API:
from executorch.devtools.visualization import (
    generate_html,
    visualize_edge_manager,
    extract_from_pte,
)
In your export script, before to_executorch():
visualize_edge_manager(edge_manager, "my_model.html")
Or from a .pte file:
graph_data = extract_from_pte("model.pte")
generate_html(graph_data, "model.html")
Features
- … etc.), not just Cortex-M
- Delegate …
- … compound nodes, and error banners for failed passes
- … controls
Complements existing tooling
This sits alongside the existing devtools/visualization Model Explorer integration. Use Model Explorer
for live interactive server-based exploration; use this for offline, shareable, self-contained HTML
snapshots.