This works as follows:
- Users build their GPU code with `CudaBuilder` as normal.
- In their `build.rs`, just after building their PTX, they will:
  - create a `cuda_builder::cg::CooperativeGroups` instance,
  - add any needed compiler options (`-arch=sm_*` and so on),
  - call `.compile(..)`, which will spit out a fully linked `cubin`.
- In their application code, instead of using `launch!` to schedule their GPU work, they will now use `launch_cooperative!`.

todo

- Expose `cuLaunchCooperativeKernel` in a nice interface. We can add the cooperative multi-device bits later, along with all of the other bits from the cooperative API.
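To make the todo item concrete, here is a rough sketch of what a thin wrapper over `cuLaunchCooperativeKernel` could look like. It is not the interface this PR ships: `Dim3`, `LaunchError`, `launch_cooperative`, `RawCooperativeLaunch` and `mock_launch` are hypothetical names made up for illustration. Only the parameter list of the raw call follows the CUDA driver API. The raw function is passed in as a function pointer so the sketch compiles and runs without linking against libcuda; a real wrapper would call the driver binding directly.

```rust
use std::ffi::c_void;

// Opaque stand-ins for the driver's CUfunction and CUstream handles.
pub type CUfunction = *mut c_void;
pub type CUstream = *mut c_void;

/// Same parameter list as the driver API's `cuLaunchCooperativeKernel`.
/// The return value is the raw CUresult code (0 means CUDA_SUCCESS).
pub type RawCooperativeLaunch = unsafe extern "C" fn(
    f: CUfunction,
    grid_dim_x: u32,
    grid_dim_y: u32,
    grid_dim_z: u32,
    block_dim_x: u32,
    block_dim_y: u32,
    block_dim_z: u32,
    shared_mem_bytes: u32,
    stream: CUstream,
    kernel_params: *mut *mut c_void,
) -> u32;

/// Hypothetical grid/block dimension type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Dim3 {
    pub x: u32,
    pub y: u32,
    pub z: u32,
}

/// Hypothetical error type carrying the raw CUresult code.
#[derive(Debug, PartialEq, Eq)]
pub struct LaunchError(pub u32);

/// Hypothetical wrapper: groups the dimensions and maps the status code
/// to a `Result`.
///
/// # Safety
/// `func`, `stream` and every pointer in `params` must be valid for `raw`.
pub unsafe fn launch_cooperative(
    raw: RawCooperativeLaunch,
    func: CUfunction,
    grid: Dim3,
    block: Dim3,
    shared_mem_bytes: u32,
    stream: CUstream,
    params: &mut [*mut c_void],
) -> Result<(), LaunchError> {
    let code = unsafe {
        raw(
            func,
            grid.x,
            grid.y,
            grid.z,
            block.x,
            block.y,
            block.z,
            shared_mem_bytes,
            stream,
            params.as_mut_ptr(),
        )
    };
    if code == 0 { Ok(()) } else { Err(LaunchError(code)) }
}

/// Test double standing in for the driver: returns 1
/// (CUDA_ERROR_INVALID_VALUE) for an empty grid, 0 otherwise.
pub unsafe extern "C" fn mock_launch(
    _f: CUfunction,
    grid_dim_x: u32,
    _grid_dim_y: u32,
    _grid_dim_z: u32,
    _block_dim_x: u32,
    _block_dim_y: u32,
    _block_dim_z: u32,
    _shared_mem_bytes: u32,
    _stream: CUstream,
    _kernel_params: *mut *mut c_void,
) -> u32 {
    if grid_dim_x == 0 { 1 } else { 0 }
}
```

A `launch_cooperative!` macro could then be a thin layer that collects kernel arguments into the `params` slice, much as `launch!` does for regular launches.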