
ci: add aarch64 build for nemotron-ocr wheel #1663

Open
jdye64 wants to merge 4 commits into main from feat/ocr-arm64-build

jdye64 (Collaborator) commented Mar 19, 2026

Summary

  • The nemotron_ocr_cpp C++ extension was only built on x86_64 GitHub Actions runners, producing a wheel that fails with ModuleNotFoundError: No module named 'nemotron_ocr_cpp._nemotron_ocr_cpp' on ARM hosts (e.g. NVIDIA DGX Spark).
  • Converts the build_ocr_cuda job to a matrix strategy that builds on both ubuntu-latest (x86_64) and ubuntu-24.04-arm (aarch64), using the same nvidia/cuda:13.0.0-devel-ubuntu24.04 container image (published for both architectures).
  • Artifact names are suffixed with the architecture (dist-nemotron-ocr-v1-x86_64 / dist-nemotron-ocr-v1-aarch64) to avoid upload collisions.

Test plan

  • Trigger the workflow manually with upload_to: none and verify both matrix legs (x86_64 and aarch64) complete successfully
  • Inspect the uploaded artifacts to confirm each wheel contains the correct platform-tagged .so (e.g. _nemotron_ocr_cpp.cpython-312-aarch64-linux-gnu.so)
  • Install the aarch64 wheel on a DGX Spark and verify from nemotron_ocr_cpp._nemotron_ocr_cpp import * succeeds
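
The artifact inspection step in the test plan can be scripted. The following is a minimal sketch, not part of this PR: the helper names `native_extensions` and `has_extension_for` are hypothetical, and it assumes the extension follows the usual CPython naming shown above (`...-<arch>-linux-gnu.so`). A wheel is a zip archive, so the standard `zipfile` module is enough to list its contents.

```python
import zipfile


def native_extensions(wheel_path):
    """Return the names of compiled extension modules (.so files) inside a wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return [name for name in wheel.namelist() if name.endswith(".so")]


def has_extension_for(wheel_path, arch):
    """Return True if any .so in the wheel is tagged for `arch` (e.g. "aarch64")."""
    # Assumption: extensions use the CPython suffix "<abi>-<arch>-linux-gnu.so".
    marker = f"-{arch}-linux-gnu.so"
    return any(name.endswith(marker) for name in native_extensions(wheel_path))
```

Running `has_extension_for(path_to_wheel, "aarch64")` against the artifact downloaded from the aarch64 leg should return `True`, and the same call with `"x86_64"` should return `False`.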

Made with Cursor

The nemotron_ocr_cpp C++ extension was only built on x86_64 runners,
causing ModuleNotFoundError on ARM hosts like the DGX Spark. Run the
build_ocr_cuda job as a matrix across x86_64 (ubuntu-latest) and
aarch64 (ubuntu-24.04-arm) so pip can install the correct platform
wheel on either architecture.

Made-with: Cursor

jdye64 requested a review from a team as a code owner March 19, 2026 19:21
jdye64 requested a review from jperez999 March 19, 2026 19:21
jdye64 added 3 commits March 19, 2026 15:33
The upstream nemotron-ocr build script hard-codes -mavx2 (an x86-only
SIMD flag) in its C++ extension compile args.  On aarch64 runners this
causes an immediate compile failure.

Add a generic --strip-cflag option to nightly_build_publish.py that
removes quoted occurrences of a given compiler flag from upstream
Python build scripts after cloning.  The workflow now passes
--strip-cflag=-mavx2 when building on aarch64.
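
For illustration, removing quoted occurrences of a flag from a Python build script could look roughly like the sketch below. This is not the code from nightly_build_publish.py: the function name `strip_cflag` and the comma handling are assumptions made for the example.

```python
import re


def strip_cflag(source, flag):
    """Remove quoted occurrences of `flag` from Python source text.

    Hypothetical helper: only exact string literals such as "-mavx2" or
    '-mavx2' are removed, together with one adjacent comma so that list
    literals stay syntactically valid.
    """
    escaped = re.escape(flag)
    for quote in ("'", '"'):
        literal = quote + escaped + quote
        # Flag followed by a comma: drop the flag and that comma.
        source = re.sub(rf"{literal}\s*,\s*", "", source)
        # Flag is the last element: drop the comma in front of it.
        source = re.sub(rf"\s*,\s*{literal}", "", source)
        # Flag is the only element: drop the bare literal.
        source = re.sub(literal, "", source)
    return source
```

With this sketch, `strip_cflag('args = ["-O3", "-mavx2", "-fopenmp"]', "-mavx2")` yields `'args = ["-O3", "-fopenmp"]'`. Text rewriting of this kind is fragile against upstream formatting changes, which is one reason to prefer an upstream-supported switch when one exists.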

Made-with: Cursor

The upstream nemotron-ocr build script already checks ARCH to skip
x86-only SIMD flags (-mavx2) on ARM.  Pass ARCH=arm64 via --build-env
on the aarch64 leg instead of our regex-based --strip-cflag approach.

This removes the _strip_cflags machinery in favour of the upstream's
own escape hatch.
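
The upstream check itself is not shown in this PR. As a rough sketch of the general pattern, a build script can consult an `ARCH` environment variable and drop x86-only flags on ARM. Everything below is hypothetical, including the name `compile_args`, the fallback to `platform.machine()`, and the set of values treated as ARM.

```python
import os
import platform

# Flags that only make sense on x86 targets (assumption for this sketch).
X86_ONLY_FLAGS = ("-mavx2",)


def compile_args(base_flags, arch=None):
    """Return `base_flags` with x86-only flags removed when targeting ARM."""
    # Explicit argument wins, then the ARCH env var, then the host machine type.
    arch = (arch or os.environ.get("ARCH") or platform.machine()).lower()
    is_arm = arch in {"arm64", "aarch64"}
    return [f for f in base_flags if not (is_arm and f in X86_ONLY_FLAGS)]
```

Under these assumptions, `compile_args(["-O3", "-mavx2"], arch="arm64")` returns `["-O3"]`, while the same call with `arch="x86_64"` leaves the list unchanged.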

Made-with: Cursor

Change the dev suffix from YYYYMMDD to YYYYMMDDHHmmss so multiple
builds on the same calendar day each receive a unique, monotonically
increasing version that PyPI will accept.

The legacy NIGHTLY_DATE_YYYYMMDD env var override is still honoured;
a new NIGHTLY_DATE_SUFFIX var is also supported.
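
A minimal sketch of how such a suffix might be computed follows. It is not the code from this commit: the function name `nightly_dev_suffix`, the use of UTC, and the precedence of NIGHTLY_DATE_SUFFIX over NIGHTLY_DATE_YYYYMMDD are assumptions. The digits-only check reflects the fact that the number in a PEP 440 `.devN` segment must be an integer.

```python
import os
from datetime import datetime, timezone


def nightly_dev_suffix(env=None, now=None):
    """Return the numeric part of a .devN version segment for a nightly build."""
    env = os.environ if env is None else env
    # Assumed precedence: new variable first, then the legacy one.
    override = env.get("NIGHTLY_DATE_SUFFIX") or env.get("NIGHTLY_DATE_YYYYMMDD")
    if override:
        if not override.isdigit():
            raise ValueError(f"dev suffix must be all digits, got {override!r}")
        return override
    # Assumption: timestamps are taken in UTC so they only ever increase.
    now = now or datetime.now(timezone.utc)
    return now.strftime("%Y%m%d%H%M%S")
```

Because the 14-digit timestamp starts with the same eight date digits as the old format, a build made later on the same day compares as a larger integer than one made earlier.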

Made-with: Cursor
2 participants