[SYCL] add an e2e test that passes for liboffload with L0#21584
[SYCL] add an e2e test that passes for liboffload with L0#21584againull merged 1 commit intointel:syclfrom
Conversation
2ac3b22 to
187486d
Compare
abfe1b4 to
3c7b065
Compare
| if re.match(r"\[offload:.*", line) and not is_offload_preferred_backend_set: | ||
| if re.match(".*Level[ _]Zero.*", line, re.IGNORECASE): | ||
| offload_assigned_backend = config.backend_to_target["level_zero"] | ||
| is_offload_preferred_backend_set = True | ||
| elif re.match(".*nvidia.*", line, re.IGNORECASE): | ||
| offload_assigned_backend = config.backend_to_target["cuda"] |
There was a problem hiding this comment.
I think we can just assign to config.backend_to_target["offload"] directly and get rid of offload_assigned_backend.
Does the logic here depends on the order of devices/backends in sycl-ls output? I.e. if CUDA comes after L0 in the sycl-ls output, then it will be picked as offload backend. Is this the desired behavior?
There was a problem hiding this comment.
I've been afraid I might have overcomplicated, so I'm glad that you point this out.
L0 backend is preferred, but the outcome is not dependent on the order of the devices listed by sycl-ls.
config.backend_to_target["offload"] might be set to target-nvidia if NVIDIA's device is the first one encountered, but is_offload_preferred_backend_set is not set, therefore the condition is still entered in the next iterations. However, when L0 is found, is_offload_preferred_backend_set is set to true, which prevents from entering the condition in the following iterations. If we start with L0 instead, is_offload_preferred_backend_set would be set before we could find NVIDIA - we would not reassign the preferred backend.
I wanted to prevent setting config.backend_to_target["offload"] to "", which was the case with using only OFFLOAD_BUILD_TARGET, but I am not sure whether preferring L0 over other devices is a strategy which is aligned with our goals. On the other hand, we focus on supporting L0 devices. One thing that concerns me is missing backends - for example, HIP is not supported by offload at this moment. Can we encounter only AMD device in our tests?
There was a problem hiding this comment.
Ok, thanks for clarification. We don't run tests on AMD in our CI,
I'm trying to understand what we are trying to achieve.
So, before this change, if we wanted to run tests on cuda via offload backend, we would do something like this:
llvm-lit --param offload_build_target="target-nvidia" --param sycl_devices="offload:gpu" <repo_path>/sycl/test-e2e
To run on L0 probably this:
llvm-lit --param offload_build_target="target-spirv" --param sycl_devices="offload:gpu" <repo_path>/sycl/test-e2e
At least, if I am reading https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/README.md right.
With this change we can still do the same as above, but additionally if "--param offload_build_target" is not provided, we will try to automatically set it. And as you clarified, on machine with both L0 and cuda it will be set to target-spirv.
I think this approach is fine, at least for now.
There was a problem hiding this comment.
The test would not work without this change when liboffload is built, even with llvm-lit --param offload_build_target="target-nvidia" --param sycl_devices="offload:gpu" <repo_path>/sycl/test-e2e (as I have checked a moment ago). The backend for liboffload would still be undetermined: the tests are not compiled with the OFFLOAD_BUILD_TARGET option, therefore there is only "" left as a value in the dictionary mapping from backends to targets, resulting in a KeyError. Additionally, OFFLOAD_BUILD_TARGET is never mentioned in any CMakeLists.txt, therefore it has been impossible to use liboffload.
I'll add this clarification to the PR description.
| // Allocate shared memory bound to the device and context associated to the | ||
| // queue Replacing malloc_shared with malloc_host would yield a correct | ||
| // program that allocated device-visible memory on the host. |
There was a problem hiding this comment.
NIT: Comment about malloc_host seems excessive and can be removed.
There was a problem hiding this comment.
Removed c:
3c7b065 to
0d2d8dc
Compare
| if re.match(r"\[offload:.*", line) and not is_offload_preferred_backend_set: | ||
| if re.match(".*Level[ _]Zero.*", line, re.IGNORECASE): | ||
| offload_assigned_backend = config.backend_to_target["level_zero"] | ||
| is_offload_preferred_backend_set = True | ||
| elif re.match(".*nvidia.*", line, re.IGNORECASE): | ||
| offload_assigned_backend = config.backend_to_target["cuda"] |
There was a problem hiding this comment.
Ok, thanks for clarification. We don't run tests on AMD in our CI,
I'm trying to understand what we are trying to achieve.
So, before this change, if we wanted to run tests on cuda via offload backend, we would do something like this:
llvm-lit --param offload_build_target="target-nvidia" --param sycl_devices="offload:gpu" <repo_path>/sycl/test-e2e
To run on L0 probably this:
llvm-lit --param offload_build_target="target-spirv" --param sycl_devices="offload:gpu" <repo_path>/sycl/test-e2e
At least, if I am reading https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/README.md right.
With this change we can still do the same as above, but additionally if "--param offload_build_target" is not provided, we will try to automatically set it. And as you clarified, on machine with both L0 and cuda it will be set to target-spirv.
I think this approach is fine, at least for now.
| // RUN: %{build} -o %t.out | ||
| // RUN: %{run} %t.out | ||
|
|
||
| #include <iostream> |
There was a problem hiding this comment.
| #include <iostream> |
| // Explicitly wait for kernel execution since there is no accessor involved | ||
| myQueue.wait(); | ||
|
|
||
| // Print result |
There was a problem hiding this comment.
| // Print result |
|
@intel/llvm-gatekeepers please consider merging |
0d2d8dc to
ad4cf19
Compare
The liboffload phase 0 program is added as one of SYCL e2e tests. Moreover, lit configuration files are modified so that the liboffload backend is determined according to
sycl-lsoutput, while the backend specified by the user preserves its precedence. When choosing between devices listed bysycl-ls, the Level Zero backend is preferred.The tests would not work without this change when liboffload is built, even with
llvm-lit --param sycl_devices="offload:gpu" <path>. The backend for liboffload would still be undetermined: the tests are not compiled with theOFFLOAD_BUILD_TARGEToption, therefore there is only""left as a value in the dictionary mapping from backends to targets, resulting in a KeyError. Additionally,OFFLOAD_BUILD_TARGETis never mentioned in any CMakeLists.txt, therefore it has been impossible to use liboffload.