[ExecuTorch][WebGPU] Add clone op (aten.clone.default)#20463
JulianCloudNTH wants to merge 2 commits into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20463

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 2 Pending, 1 Unrelated Failure
As of commit 3ea2424 with merge base e03f777:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Stack from ghstack (oldest at bottom):
`aten.clone.default` is a pure flat copy on the buffer-only WebGPU backend, identical to `view_copy`: `clone_impl` reuses the existing `add_flat_copy` helper (`output[i] = input[i]`) and registers a handler under `aten.clone.default`. No new shader, generated WGSL header, or CMake source is needed; it shares the `view_copy` flat-copy compute pipeline.

Required for end-to-end Llama 3.2 1B (4-bit, KV cache): the exported model serializes 2 `aten.clone.default` ops into its runtime operator chain (the RoPE-frequency clones reused across all 16 transformer layers), so without a handler the partition graph-breaks at those nodes. Mirrors the Vulkan delegate, which registers the same op and routes a buffer clone to a flat view-copy.

Differential Revision: D109477717
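
For readers unfamiliar with the pattern, here is a minimal CPU-only sketch of the idea described above: two op names registered against handlers that both delegate to one flat-copy helper. It is not the backend's code. `Buffer`, `op_registry`, `register_ops`, `run_op`, and the `aten.view_copy.default` key are hypothetical stand-ins introduced for illustration; only the names `clone_impl`, `add_flat_copy`, and `aten.clone.default` come from the description, and their real signatures are assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a buffer-backed tensor (not an ExecuTorch type).
struct Buffer {
  std::vector<float> data;
};

// Assumed shape of the flat-copy helper: output[i] = input[i].
// The real backend would dispatch a GPU compute pipeline; this loops on the CPU.
void add_flat_copy(const Buffer& input, Buffer& output) {
  output.data.resize(input.data.size());
  for (std::size_t i = 0; i < input.data.size(); ++i) {
    output.data[i] = input.data[i];
  }
}

using OpHandler = std::function<void(const Buffer&, Buffer&)>;

// Hypothetical operator registry keyed by op name.
std::unordered_map<std::string, OpHandler>& op_registry() {
  static std::unordered_map<std::string, OpHandler> registry;
  return registry;
}

// Both handlers share the same flat-copy path, so clone needs no new kernel.
void view_copy_impl(const Buffer& in, Buffer& out) { add_flat_copy(in, out); }
void clone_impl(const Buffer& in, Buffer& out) { add_flat_copy(in, out); }

void register_ops() {
  op_registry()["aten.view_copy.default"] = view_copy_impl;  // assumed key
  op_registry()["aten.clone.default"] = clone_impl;
}

// Returns false when no handler is registered, which models the graph break
// that occurs at an unsupported node.
bool run_op(const std::string& name, const Buffer& in, Buffer& out) {
  auto it = op_registry().find(name);
  if (it == op_registry().end()) {
    return false;
  }
  it->second(in, out);
  return true;
}
```

After `register_ops()` is called, `run_op("aten.clone.default", in, out)` copies `in` into `out` element by element, while an op name with no registered handler returns `false`.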