Split grouped quantize/activations and dbias for faster compilation on multicore machines #2983
ptrendx wants to merge 3 commits into
Signed-off-by: Przemek Tredak <ptredak@nvidia.com>
44a7d09 to 30880ef
for more information, see https://pre-commit.ci
/te-ci
Greptile Summary
This PR splits large CUDA source files containing grouped quantize/activation and dbias functions into smaller, separate compilation units to improve parallel compilation speed on multicore machines. No functional logic is changed; every function body is reproduced verbatim in its new file.
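For readers unfamiliar with why splitting helps: a build system can only parallelize across translation units, so one large file full of heavy instantiations is compiled by a single core. The following is a minimal host-only C++ sketch of the general pattern, not the code from this PR. All names (`apply_activation`, `Relu`, `Square`, and the file names in the comments) are hypothetical, and the use of explicit template instantiation is an assumption about how such a split is commonly done. The three "files" are shown in one block so that it compiles as-is.

```cpp
#include <cstddef>
#include <vector>

// ---- would live in a shared header, e.g. activation_common.h (hypothetical) ----
// Generic element-wise routine, parameterized on the activation functor.
template <typename Op>
std::vector<float> apply_activation(const std::vector<float>& in) {
  std::vector<float> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) out[i] = Op{}(in[i]);
  return out;
}

struct Relu {
  float operator()(float x) const { return x > 0.0f ? x : 0.0f; }
};
struct Square {
  float operator()(float x) const { return x * x; }
};

// Explicit instantiation declarations: includers must not instantiate these
// themselves, so the expensive code generation happens in exactly one file each.
extern template std::vector<float> apply_activation<Relu>(const std::vector<float>&);
extern template std::vector<float> apply_activation<Square>(const std::vector<float>&);

// Non-template entry points that the rest of the library calls.
std::vector<float> relu(const std::vector<float>& in);
std::vector<float> square(const std::vector<float>& in);

// ---- would live in relu.cpp (hypothetical), compiled by one core ----
template std::vector<float> apply_activation<Relu>(const std::vector<float>&);
std::vector<float> relu(const std::vector<float>& in) {
  return apply_activation<Relu>(in);
}

// ---- would live in square.cpp (hypothetical), compiled by another core ----
template std::vector<float> apply_activation<Square>(const std::vector<float>&);
std::vector<float> square(const std::vector<float>& in) {
  return apply_activation<Square>(in);
}
```

With the two instantiations in separate source files, a parallel build can compile them concurrently, while callers still see only the `relu` and `square` entry points. Each new file also has to be registered with the build system, which is why the split touches CMakeLists.txt.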
Confidence Score: 5/5
Safe to merge: purely a mechanical file-splitting refactor with no behavioral changes. Every moved function body is bit-for-bit identical to the removed code; CMakeLists.txt correctly registers all 12 new files in the right lists; the explicit ptx.cuh include in common.cuh hardens an existing implicit dependency; and activation/glu.cu, which has no grouped or dbias variants, is correctly left untouched. No files require special attention.
Important Files Changed
Reviews (1): Last reviewed commit: "[pre-commit.ci] auto fixes from pre-comm..."
Description
The proper fix would be to compile these kernels at runtime with NVRTC, but splitting the files at least makes the build slightly more manageable.
Type of change