[Release/3.3] Fix out-of-bounds memory access in SetKernel for 0-size tensor by hushenwei2000 · Pull Request #78536 · PaddlePaddle/Paddle

hushenwei2000 · 2026-03-31T10:56:18Z

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

Fix errors raised for 0-size tensors.

Reported failure: paddle.Tensor.set_, accuracy test, CPU, incorrect precision. Failing cases:
- paddle.Tensor.set_(Tensor([20],"complex64"), Tensor([0, 3],"complex64"), list[20,], list[2,], 0, )
- paddle.Tensor.set_(Tensor([20],"float32"), Tensor([0, 3],"float32"), list[20,], list[2,], 0, )

Summary

Fix GPU out-of-bounds memory access (detected by compute-sanitizer) when calling Tensor.set_(source, shape, stride, offset) with a 0-size source tensor and non-zero target shape.
When source.numel() == 0, the output tensor now inherits the source's 0-size dims/strides instead of using user-specified shape/stride, preventing invalid memory access.
Updated existing TestSet_API_ZeroSize and added 5 new test cases covering various 0-size scenarios.

Root Cause

In SetKernel (paddle/phi/kernels/set_kernel.cc), the conditional logic for handling 0-size tensors had a missing branch: when source.numel() == 0 and x.numel() != 0, no branch was executed. The output tensor retained its original data holder, but SetInferMeta had already set its meta to the user-specified shape/stride (e.g., shape=[20], stride=[2]). When ContiguousKernel later attempted to read 20 elements via stride=2 (requiring storage for indices 0–38), the underlying storage was empty (nullptr), causing CUDA illegal memory access.
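The index range quoted above follows from ordinary strided-view addressing. A minimal sketch of that arithmetic in plain Python; the helper name `required_storage_elems` is hypothetical (it is not a Paddle API), and it assumes non-negative strides and an offset counted in elements:

```python
def required_storage_elems(shape, stride, offset=0):
    """Number of storage elements a strided view needs to be able to address.

    Hypothetical helper for illustration only. Assumes non-negative
    strides and an offset expressed in elements (an assumption).
    """
    if any(dim == 0 for dim in shape):
        return 0  # a 0-size view never dereferences storage
    # The largest reachable index is hit at the last position of every axis.
    max_index = offset + sum((dim - 1) * st for dim, st in zip(shape, stride))
    return max_index + 1


# shape=[20], stride=[2] touches indices 0, 2, ..., 38, so 39 elements are needed.
print(required_storage_elems([20], [2]))  # 39
# A 0-size view such as shape=[0, 3] needs no storage at all.
print(required_storage_elems([0, 3], [3, 1]))  # 0
```

Any mismatch between this requirement and the storage actually bound to the tensor is what a tool such as compute-sanitizer would flag.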

Before fix (missing branch):

if source.numel() != 0:    # False (source is 0-size)
    ...
elif x.numel() == 0:        # False (x has 20 elements)
    ...

Neither branch executes → out keeps stale holder with mismatched meta
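The fall-through can be checked by enumerating the four numel combinations. This is only a model of the conditional shown above, not the kernel itself; the function name and branch labels are hypothetical:

```python
def branch_taken_before_fix(source_numel, x_numel):
    """Return a label for the pre-fix branch that would run, or None."""
    if source_numel != 0:
        return "source_nonzero"
    elif x_numel == 0:
        return "x_zero"
    return None  # fall-through: neither branch runs


# Enumerate every combination of empty / non-empty source and x.
for source_numel in (0, 6):
    for x_numel in (0, 20):
        print(source_numel, x_numel, branch_taken_before_fix(source_numel, x_numel))

# The failing case from this PR: x = Tensor([20]), source = Tensor([0, 3]).
print(branch_taken_before_fix(source_numel=0, x_numel=20))  # None
```

Only the combination of an empty source with a non-empty x returns None, which matches the case described in the root cause.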

Fix

When source.numel() == 0, force the output tensor's dims/strides to match the source's 0-size shape, and rebind the holder to the source's (empty) storage. This ensures numel == 0, so downstream kernels (e.g., ContiguousKernel) skip safely via their numel <= 0 early-return guards.
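The rule described above can be modeled on metadata alone. The sketch below is a simplified stand-in, not the code in set_kernel.cc: `Meta`, `out_meta_after_fix`, and `contiguous_would_read` are hypothetical names, and holder rebinding is not modeled.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Meta:
    """Hypothetical stand-in for a tensor's dims/strides metadata."""
    dims: tuple
    strides: tuple

    @property
    def numel(self):
        return prod(self.dims)


def out_meta_after_fix(source, shape, stride):
    """Model of the fixed rule: a 0-size source overrides the requested view."""
    if source.numel == 0:
        return Meta(source.dims, source.strides)
    return Meta(tuple(shape), tuple(stride))


def contiguous_would_read(meta):
    """Model of a `numel <= 0` early-return guard in a downstream kernel."""
    return meta.numel > 0


source = Meta(dims=(0, 3), strides=(3, 1))
out = out_meta_after_fix(source, shape=[20], stride=[2])
print(out.dims, out.numel, contiguous_would_read(out))  # (0, 3) 0 False
```

Because the output reports zero elements, a guarded downstream kernel has nothing to read, regardless of the shape and stride the caller requested.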

Test Plan

Verified fix with compute-sanitizer: ERROR SUMMARY: 0 errors (was 11 errors before fix)
- paddle.Tensor.set_(Tensor([20],"float32"), Tensor([0, 3],"float32"), list[20,], list[2,], 0, )
Added 5 new test cases in TestSet_API_ZeroSize:
- test_zero_size_source_with_nonzero_shape — 0-size source + explicit non-zero shape
- test_zero_size_source_default_args — 0-size source with default shape/stride
- test_zero_size_x_nonzero_source — 0-size x with non-zero source
- test_both_zero_size — both x and source are 0-size
- test_zero_size_source_no_crash_on_contiguous — no crash on .contiguous() after set_
All existing set_ cases in APITest:

paddle.Tensor.set_(Tensor([20],"float64"), Tensor([0, 3],"float64"), list[20,], list[2,], 0, )
paddle.Tensor.set_(Tensor([20],"float64"), Tensor([15, 0],"float64"), list[20,], list[2,], 0, )
paddle.Tensor.set_(Tensor([3, 0],"float16"), Tensor([6, 0],"float16"), list[3,8,], list[2,2,], 0, )
paddle.Tensor.set_(Tensor([3, 8],"float16"), Tensor([0, 3],"float16"), list[3,8,], list[2,2,], 0, )
paddle.Tensor.set_(Tensor([3, 8],"float16"), Tensor([6, 0],"float16"), list[3,8,], list[2,2,], 0, )
paddle.Tensor.set_(Tensor([20],"complex64"), Tensor([0, 3],"complex64"), list[20,], list[2,], 0, )
paddle.Tensor.set_(Tensor([20],"float32"), Tensor([0, 3],"float32"), list[20,], list[2,], 0, )
paddle.Tensor.set_(Tensor([20],"complex64"), Tensor([15, 0],"complex64"), list[20,], list[2,], 0, )
paddle.Tensor.set_(Tensor([20],"float32"), Tensor([15, 0],"float32"), list[20,], list[2,], 0, )

Does this change affect numerical precision?

No

When calling Tensor.set_(source, shape, stride, offset) with a 0-size source tensor and non-zero target shape, the original code had a missing branch in the conditional logic: when source.numel()==0 and x.numel()!=0, no branch was executed, leaving `out` with its original data holder but with the user-specified meta (shape/stride). This caused ContiguousKernel to read beyond allocated memory when converting the strided tensor to contiguous. The fix forces the output tensor to inherit the source's 0-size dims/strides when source has no elements, preventing out-of-bounds access.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

paddle-bot · 2026-03-31T10:56:23Z

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

codecov-commenter · 2026-03-31T15:34:44Z

Codecov Report

❌ Patch coverage is 94.59459% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (release/3.3@88e1968). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
paddle/phi/kernels/set_kernel.cc	94.59%	2 Missing ⚠️

Additional details and impacted files

@@              Coverage Diff               @@
##             release/3.3   #78536   +/-   ##
==============================================
  Coverage               ?   94.59%           
==============================================
  Files                  ?        1           
  Lines                  ?       37           
  Branches               ?        0           
==============================================
  Hits                   ?       35           
  Misses                 ?        2           
  Partials               ?        0

hushenwei2000 · 2026-04-01T02:32:08Z

/re-run all-failed

hushenwei2000 · 2026-04-01T03:58:41Z

/re-run all-failed

wanghuancoder

LGTM

hushenwei2000 and others added 6 commits March 31, 2026 18:51

fix(phi): fix set_kernel access memory error when tensor is empty

8600d0a

test(inplace): fix assert of 0-size tensor set_ behaviour

5d374bc

fix(phi): fix CPU

b5abc74

test(inplace): add 0-dim tensor set to non-0-size tensor tests

3f5f959

fix(set): handle zero-element output in inplace operation

e6cda03

wanghuancoder approved these changes Apr 1, 2026

View reviewed changes

sneaxiy approved these changes Apr 3, 2026

View reviewed changes

sneaxiy merged commit 67d72e1 into PaddlePaddle:release/3.3 Apr 3, 2026
142 of 153 checks passed

sneaxiy merged 6 commits into PaddlePaddle:release/3.3 from hushenwei2000:release33_fix/set-kernel-zero-size-oob

4 participants
