[slimtensor] Enable CUDA tensor copy #16800
Conversation
Pull Request resolved: #16769

This diff adds CUDA storage infrastructure to SlimTensor, enabling GPU memory allocation and management.

**Key changes:**

1. **`cuda/Guard.h`** - `CUDAGuard` RAII class:
   - Saves the current CUDA device on construction and restores it on destruction
   - Exception-safe device context switching
   - Constructors accept a device index or a `Device` object
2. **`core/Storage.h`** - Extended for CUDA support:
   - Added a `DeviceTraits<DeviceType::CUDA>` specialization with:
     - `allocate()` - uses `cudaMalloc` with `CUDAGuard` for device selection
     - `free()` - uses `cudaFree`, logging a warning on error
     - `memcpy()` - supports Host↔Device and Device↔Device copies
   - Added a `DEFAULT_CUDA_DEVICE` constant
   - Updated the `MaybeOwningStorage` constructor to handle CUDA devices
   - Stub implementation that throws an error when `CUDA_AVAILABLE` is not defined

ghstack-source-id: 335102161
@exported-using-ghexport

Differential Revision: [D91202899](https://our.internmc.facebook.com/intern/diff/D91202899/)
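For readers unfamiliar with the guard pattern, here is a minimal, hypothetical sketch of how a device guard and a CUDA allocation helper can be written against the CUDA runtime API. The names mirror the description above, but the actual signatures in `cuda/Guard.h` and `core/Storage.h` may differ.

```cpp
// Hypothetical sketch, not the actual ExecuTorch sources.
#include <cuda_runtime.h>
#include <cstddef>
#include <stdexcept>

// RAII guard: remembers the current device, switches to the requested one,
// and restores the original device when the scope ends (even on exceptions).
class CUDAGuard {
 public:
  explicit CUDAGuard(int device_index) {
    if (cudaGetDevice(&prev_device_) != cudaSuccess) {
      throw std::runtime_error("cudaGetDevice failed");
    }
    if (device_index != prev_device_ &&
        cudaSetDevice(device_index) != cudaSuccess) {
      throw std::runtime_error("cudaSetDevice failed");
    }
  }
  ~CUDAGuard() {
    // Best-effort restore; never throw from a destructor.
    cudaSetDevice(prev_device_);
  }
  CUDAGuard(const CUDAGuard&) = delete;
  CUDAGuard& operator=(const CUDAGuard&) = delete;

 private:
  int prev_device_ = 0;
};

// Allocation helper in the spirit of DeviceTraits<DeviceType::CUDA>::allocate():
// select the target device via the guard, then cudaMalloc on it.
inline void* cuda_allocate(size_t nbytes, int device_index) {
  CUDAGuard guard(device_index);
  void* ptr = nullptr;
  if (cudaMalloc(&ptr, nbytes) != cudaSuccess) {
    throw std::runtime_error("cudaMalloc failed");
  }
  return ptr;
}
```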
Pull Request resolved: #16770

This diff enables CUDA tensor creation with basic tensor functionality and factory function support.

**Key changes:**

1. **`core/SlimTensor.h`** - Extended for CUDA support:
   - Added an `is_cuda()` method to check whether a tensor is on a CUDA device
2. **`factory/Empty.h`** - Supports CUDA:
   - `empty_strided()` and `empty()` work with a CUDA device via `new_storage()`
   - Device routing is handled by the `MaybeOwningStorage` constructor

ghstack-source-id: 335102160
@exported-using-ghexport

Differential Revision: [D91202897](https://our.internmc.facebook.com/intern/diff/D91202897/)
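As a rough illustration of the device-routing idea (not the actual SlimTensor code), the sketch below uses `StorageSketch` as a stand-in for `MaybeOwningStorage`: the storage constructor picks the allocator, so the factory stays device-agnostic, and `is_cuda()` simply inspects the stored device. All names other than those quoted above are hypothetical.

```cpp
// Hypothetical sketch of the routing described above; real signatures may differ.
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <vector>

enum class DeviceType { CPU, CUDA };
struct Device {
  DeviceType type = DeviceType::CPU;
  int index = 0;
};

// Stand-in for MaybeOwningStorage: its constructor chooses the allocation
// path based on the device, so factory functions never branch on it.
struct StorageSketch {
  void* data = nullptr;
  Device device;
  StorageSketch(size_t nbytes, Device dev) : device(dev) {
    data = (dev.type == DeviceType::CUDA)
        ? nullptr  // would call the CUDA device traits' allocate() here
        : std::malloc(nbytes);
  }
  ~StorageSketch() {
    if (device.type == DeviceType::CPU) std::free(data);
    // The CUDA branch would release memory via the CUDA device traits.
  }
};

struct TensorSketch {
  std::shared_ptr<StorageSketch> storage;
  bool is_cuda() const { return storage->device.type == DeviceType::CUDA; }
};

// empty(): allocate numel * itemsize bytes on the requested device.
TensorSketch empty(const std::vector<int64_t>& sizes, size_t itemsize, Device device) {
  size_t numel = 1;
  for (int64_t s : sizes) numel *= static_cast<size_t>(s);
  return TensorSketch{std::make_shared<StorageSketch>(numel * itemsize, device)};
}
```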
Pull Request resolved: #16771

This diff enables CUDA tensor copy operations in SlimTensor.

**Key changes:**

**`core/SlimTensor.h`** - Extended for CUDA support:
- Updated `copy_()` to handle cross-device copies:
  - CPU→CUDA (`cudaMemcpyHostToDevice`)
  - CUDA→CPU (`cudaMemcpyDeviceToHost`)
  - CUDA→CUDA (`cudaMemcpyDeviceToDevice`, same device)
- Cross-device copies require contiguous tensors
- CPU-to-CPU copies continue to support non-contiguous (strided) tensors

ghstack-source-id: 335102159
@exported-using-ghexport

Differential Revision: [D91202900](https://our.internmc.facebook.com/intern/diff/D91202900/)
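A minimal sketch of the copy-direction dispatch, assuming contiguous buffers and using only standard CUDA runtime calls; `copy_bytes` is a hypothetical helper, not the actual `copy_()` implementation.

```cpp
// Hypothetical sketch of the direction dispatch; the real copy_() lives in
// core/SlimTensor.h and also handles strided CPU-to-CPU copies.
#include <cuda_runtime.h>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Copy `nbytes` between raw buffers whose device placement is already known.
// Cross-device copies assume both buffers are contiguous, matching the
// restriction described in the PR.
inline void copy_bytes(
    void* dst, bool dst_is_cuda,
    const void* src, bool src_is_cuda,
    size_t nbytes) {
  cudaMemcpyKind kind;
  if (src_is_cuda && dst_is_cuda) {
    kind = cudaMemcpyDeviceToDevice;  // CUDA -> CUDA (same device)
  } else if (src_is_cuda) {
    kind = cudaMemcpyDeviceToHost;    // CUDA -> CPU
  } else if (dst_is_cuda) {
    kind = cudaMemcpyHostToDevice;    // CPU -> CUDA
  } else {
    std::memcpy(dst, src, nbytes);    // CPU -> CPU (contiguous case only here)
    return;
  }
  if (cudaMemcpy(dst, src, nbytes, kind) != cudaSuccess) {
    throw std::runtime_error("cudaMemcpy failed");
  }
}
```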
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16800

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 91 Pending

As of commit ffb18f0 with merge base 8ab593b.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #16771 by @Gasoonjia
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/gasoonjia/111/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/gasoonjia/111/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/gasoonjia/110/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/gasoonjia/111/orig
Differential Revision: D91202900
@diff-train-skip-merge