[ez][ET-VK][glsl-codegen] Use mediump precision for half-precision shader variants #19287
SS-JIA wants to merge 1 commit into gh/SS-JIA/525/base
The `highp` default in `gen_vulkan_spv.py` prevents Mali GPUs from using their FP16 ALUs, because Mali's compiler honors the `highp` precision contract literally. Adreno was unaffected because Qualcomm's relaxed-precision pass silently demotes precision, but Mali-G715 was running half-precision shaders at FP32 throughput. With the `mediump` qualifier, the generated SPIR-V carries `RelaxedPrecision` decorations, which Mali's compiler uses to enable f16 packed math.

Note: this is a partial fix. Texture-storage shaders still declare local `vec4` working values, so the speedup is bounded. The follow-up is to make `texel_type("half")` return `f16vec4` and enable the FP16 extension on the texture path.
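For illustration, the precision selection described above could look roughly like the sketch below. This is not the actual code in `gen_vulkan_spv.py`: the helper names `precision_qualifier` and `precision_preamble`, the dtype strings, and the choice to emit a default-precision statement are all assumptions made for this example.

```python
# Hypothetical sketch; function names and dtype strings are assumptions,
# not the real gen_vulkan_spv.py API.

def precision_qualifier(dtype: str) -> str:
    """Pick the GLSL precision qualifier for a shader variant.

    Half-precision variants get mediump so the compiled SPIR-V can carry
    RelaxedPrecision decorations; every other variant keeps highp.
    """
    return "mediump" if dtype == "half" else "highp"


def precision_preamble(dtype: str) -> str:
    """Build a GLSL default-precision statement for float types."""
    return f"precision {precision_qualifier(dtype)} float;"


if __name__ == "__main__":
    for dtype in ("half", "float"):
        print(dtype, "->", precision_preamble(dtype))
```

Keeping `highp` for every non-half variant means only the shaders that already tolerate reduced precision are affected by the change.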
Differential Revision: [D103759541](https://our.internmc.facebook.com/intern/diff/D103759541/)