Improve how WebGPU compute shaders spread iterations across threads #8690
Description
Increasing access
Currently, if you make a compute shader that runs, say, one million iterations, and you call compute(myShader, 1000000), it will likely still be laggy because it does not fully utilize the GPU. If you instead call compute(myShader, 1000, 1000), spreading the work across both x and y, it generally runs much better. (There is also a z axis we can spread across!) This is somewhat unintuitive, and it requires you to manually compute the array index you want in your compute shader from your x and y (and maybe z) coordinates. I've demonstrated this 1D-problem-spread-across-2D technique here: https://editor.p5js.org/davepagurek/sketches/n1q_WYNrE
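The manual bookkeeping described above might look something like this (a sketch in plain JavaScript; the names are illustrative, not p5.js API — in a real shader this math would live in the shader code itself):

```javascript
// Recover the original 1D iteration index from 2D thread coordinates,
// where `width` is the x dimension the user spread their work across.
function linearIndex(x, y, width) {
  return y * width + x;
}

// Because width * height may exceed the real iteration count, threads
// past the end must bail out before touching any data.
function shouldRun(x, y, width, total) {
  return linearIndex(x, y, width) < total;
}
```

For example, with a 1000×1000 spread of one million iterations, the thread at (3, 2) handles original index 2003, and any thread whose linear index reaches 1000000 does nothing.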
Ideally, this is something we can do under the hood for you so that you can focus on the core pedagogical problem of compute shaders, which is thinking about your loop in parallel.
Most appropriate sub-area of p5.js?
- Accessibility
- Color
- Core/Environment/Rendering
- Data
- DOM
- Events
- Image
- IO
- Math
- Typography
- Utilities
- WebGL
- Build process
- Unit testing
- Internationalization
- Friendly errors
- Other (specify if possible)
Feature enhancement details
We currently pick the number of workgroups to run the compute shader on somewhat arbitrarily. We could do something a little smarter there, e.g. making the dispatch as close to a square or a cube as possible.
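One possible strategy for that (an assumption on my part, not a committed design — the dimension limit and fallback order are illustrative; WebGPU's default maxComputeWorkgroupsPerDimension is 65535):

```javascript
// Split `total` iterations into a dispatch size [x, y, z] that is as
// close to a square (or cube) as the per-dimension limit allows.
// The product x * y * z may exceed `total`; extra threads bail out.
function spreadDispatch(total, maxPerDim = 65535) {
  // Small enough to fit along one axis: no spreading needed.
  if (total <= maxPerDim) return [total, 1, 1];

  // Try a roughly square 2D spread first.
  const side = Math.ceil(Math.sqrt(total));
  if (side <= maxPerDim) {
    return [side, Math.ceil(total / side), 1];
  }

  // Fall back to a roughly cubic 3D spread.
  const edge = Math.ceil(Math.cbrt(total));
  return [edge, edge, Math.ceil(total / (edge * edge))];
}
```

With this, compute(myShader, 1000000) would dispatch a 1000×1000×1 grid, while small counts stay one-dimensional.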
The main requirement is that we also need to present the shader hook with an index in the original coordinates, not our modified ones. So if the user calls compute(myShader, 1000000), even if we internally convert this into a 1000×1000 grid, the index space the shader sees should have size [1000000, 1, 1], not [1000, 1000, 1].
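The index translation that requirement implies could be sketched like this (again illustrative names, and the math would really live in generated shader code): flatten the padded dispatch coordinates, discard padding threads, then unflatten back into the user's original index space.

```javascript
// Map a thread's coordinates in the padded dispatch grid (`dims`)
// back to its index in the user's original space (`originalSize`),
// e.g. dims = [1000, 1000, 1] but originalSize = [1000000, 1, 1].
// Returns null for padding threads past the end of the real work.
function originalIndex(coord, dims, originalSize) {
  const flat = coord[0] + dims[0] * (coord[1] + dims[1] * coord[2]);
  const total = originalSize[0] * originalSize[1] * originalSize[2];
  if (flat >= total) return null; // padding thread: do nothing

  // Unflatten into the coordinate system the user asked for.
  const x = flat % originalSize[0];
  const y = Math.floor(flat / originalSize[0]) % originalSize[1];
  const z = Math.floor(flat / (originalSize[0] * originalSize[1]));
  return [x, y, z];
}
```

For compute(myShader, 1000000) spread as 1000×1000, the thread at (5, 2, 0) would see index [2005, 0, 0] — exactly what it would have seen in a plain 1D dispatch.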