Fix nvbug6084457: Make NVLINK_MAX_LINKS version-dependent by mdboom · Pull Request #2192 · NVIDIA/cuda-python

mdboom · 2026-06-10T17:54:00Z

NVLINK_MAX_LINKS was updated in CTK 13.3. We therefore need to make this value dynamic based on the NVML version.

rwgk · 2026-06-10T18:55:18Z


-NVLINK_MAX_LINKS = 18
+
+if tuple(int(x) for x in system_get_nvml_version().split(".")) < (3, 13):


Could we obtain this value from the source of truth?

E.g. I see:

$ grep NVLINK_MAX_LINKS /usr/local/cuda_*/include/nvml.h
/usr/local/cuda_13.3.0_610.43.02_linux_kitpick035/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 36 //!< Maximum number of NVLink links supported.

I have every single /usr/local/cuda-*.* from 12.0 to 13.3, but I only see NVML_NVLINK_MAX_LINKS in the 13.3 include file.

Maybe something like this could work?

int internal_only_get_NVLINK_MAX_LINKS() {
#ifdef NVML_NVLINK_MAX_LINKS
    return NVML_NVLINK_MAX_LINKS;
#else
    return 18; // Prior to CUDA 13.3 this value was hard-wired.
#endif
}

If there isn't a practical solution, could you please add a comment to explain?

mdboom · 2026-06-11 (edited)

> I have every single /usr/local/cuda-*.* from 12.0 to 13.3, but I only see NVML_NVLINK_MAX_LINKS in the 13.3 include file.

Are you sure? I see it in 12.9 through 13.3 (though the value changed in 13.3).

We have to build a single binary that works for every version 12.9 - 13.3, so the only way to do this is with a runtime computation. Additionally, we build without the nvml.h header present. I'll add a comment.
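As a rough illustration of the runtime computation described above, the version string can be parsed into a tuple of integers and compared against a cutoff. This is a minimal sketch, not the PR's code: `parse_version`, `nvlink_max_links`, and the `cutoff` argument are hypothetical names, the version strings in the usage lines are made up, and only the values 18 and 36 come from the nvml.h headers quoted in this thread.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as "1.2.3" into (1, 2, 3)."""
    return tuple(int(part) for part in version.split("."))


def nvlink_max_links(version: str, cutoff: tuple[int, ...]) -> int:
    """Pick the link limit for a reported version (hypothetical helper).

    18 and 36 are the two NVML_NVLINK_MAX_LINKS values seen in the headers;
    `cutoff` is an assumption supplied by the caller, not a documented value.
    """
    return 18 if parse_version(version) < cutoff else 36


# Made-up version strings, for illustration only.
print(nvlink_max_links("1.9.0", cutoff=(1, 10)))   # below the cutoff
print(nvlink_max_links("1.10.0", cutoff=(1, 10)))  # at or above the cutoff
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.10" sorts before "1.9", while as tuples (1, 10) correctly compares greater than (1, 9).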

rwgk · 2026-06-12

Oh, sorry, I messed up like this (retrieved from my bash history; note the cuda_):

grep NVLINK_MAX_LINKS /usr/local/cuda_*/include/nvml.h

(I have a custom softlink for 13.3, which is why that matched.)

It looks much different like this:

$ grep NVLINK_MAX_LINKS /usr/local/cuda-*/include/nvml.h
/usr/local/cuda-12.0/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-12.1/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-12.2/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-12.3/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-12.4/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-12.5/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-12.6/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-12.8/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-12.9/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-13.0/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18
/usr/local/cuda-13.1/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18 //!< Maximum number of NVLink links supported.
/usr/local/cuda-13.2/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 18 //!< Maximum number of NVLink links supported.
/usr/local/cuda-13.3/include/nvml.h:#define NVML_NVLINK_MAX_LINKS 36 //!< Maximum number of NVLink links supported.

rwgk · 2026-06-12

I didn't think it through before, but at second look, of course the helper function I was envisioning would need to live in nvml.h itself.

I had my agent do a quick check in CUDA 13.3's nvml.h and the NVML docs. It didn't surface any runtime C API that reports NVML_NVLINK_MAX_LINKS. Your solution does indeed appear to be our only option. Ideally we'd request a runtime API, but that's for another day.

leofang · 2026-06-12

This is a module-level constant, so there's not much we can do with a helper function. @mdboom that said, this would force NVML to be loaded at import time, which breaks CPU-only envs. Can we hide this behind a module-level __getattr__?

(For example this would break the import test.)

mdboom · 2026-06-12

Experimentally, it doesn't break CPU-only builds, but it does break on systems without NVML installed. So I agree the __getattr__ trick is probably justified, though the problem is narrower than you think: nvmlSystemGetVersion doesn't require nvmlInit and doesn't require a GPU.

leofang · 2026-06-12

CPU-only envs have a broad definition, including but not limited to: GPU driver is not installed 😉
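For readers unfamiliar with the module-level `__getattr__` idea discussed above (PEP 562), here is a minimal sketch of how it can defer the version query until the constant is first read. It is not the code from this PR: `make_lazy_module`, `get_version`, `fake_version`, and `cutoff` are hypothetical names, and a `types.ModuleType` object stands in for a real module so the example runs in a single file. In an actual module, `__getattr__` would simply be a top-level function in that module's source.

```python
import types


def make_lazy_module(name, get_version, cutoff):
    """Build a stand-in module whose NVLINK_MAX_LINKS is computed on first access."""
    mod = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails (PEP 562).
        if attr == "NVLINK_MAX_LINKS":
            version = tuple(int(x) for x in get_version().split("."))
            value = 18 if version < cutoff else 36
            # Cache in the module namespace so later lookups skip __getattr__.
            setattr(mod, attr, value)
            return value
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    mod.__getattr__ = __getattr__
    return mod


calls = []


def fake_version():
    # Stand-in for the real NVML version query; records each call.
    calls.append("queried")
    return "1.0"


demo = make_lazy_module("demo", fake_version, cutoff=(2,))
print(calls)                  # nothing has been queried at "import" time
print(demo.NVLINK_MAX_LINKS)  # first access triggers the query
print(demo.NVLINK_MAX_LINKS)  # second access is served from the cached attribute
print(len(calls))
```

With this shape, an environment where the version query raises (for example, no driver installed) would fail only when the attribute is read, not when the module is imported.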

rwgk · 2026-06-12T08:24:19Z



github-actions · 2026-06-12T19:13:02Z

Doc Preview CI
🚀 View preview at https://nvidia.github.io/cuda-python/pr-preview/pr-2192/
https://nvidia.github.io/cuda-python/pr-preview/pr-2192/cuda-core/
https://nvidia.github.io/cuda-python/pr-preview/pr-2192/cuda-bindings/
https://nvidia.github.io/cuda-python/pr-preview/pr-2192/cuda-pathfinder/
Preview will be ready when the GitHub Pages deployment is complete.

mdboom · 2026-06-12T19:50:59Z

@leofang: There seem to be failures on specific hardware (A100, T4), on Windows only, where we can't use all of the stated NVLinks. I have asked internally why this might be the case. It /shouldn't/ be a driver vs. CTK difference, since NVML ships with the driver and we are asking NVML (not the CTK or cuda-bindings itself) what version we have in order to determine how many links we have. But check my work; it could be that I'm not checking the appropriate thing.

Fix nvbug6084457: Make NVLINK_MAX_LINKS version-dependent

002ce1a

mdboom added this to the cuda.bindings 13.3.1 milestone Jun 10, 2026

mdboom self-assigned this Jun 10, 2026

mdboom added the bug and cuda.bindings labels Jun 10, 2026

github-actions bot added the cuda.core label Jun 10, 2026

mdboom modified the milestones: cuda.bindings 13.3.1, cuda.bindings next Jun 10, 2026

rwgk reviewed Jun 10, 2026


mdboom and others added 2 commits June 11, 2026 12:26

Add comment

801ecb7

Merge branch 'main' into nvlink-max-links-dynamic

3889de5

mdboom requested a review from rwgk June 11, 2026 19:13

rwgk approved these changes Jun 12, 2026


mdboom added 5 commits June 12, 2026 10:16

Use a __getattr__ approach

192ac61

Merge remote-tracking branch 'upstream/main' into nvlink-max-links-dynamic

237c7e2

Merge remote-tracking branch 'origin/nvlink-max-links-dynamic' into nvlink-max-links-dynamic

51c5dce

Update .pyi files

260c34f

Fix test

62f83f4

mdboom wants to merge 8 commits into NVIDIA:main from mdboom:nvlink-max-links-dynamic


3 participants

