Description
I see that both xe KMD support and DG1 support exist, but I can't get them working together. With the following debug settings, clinfo -l shows:
export NEOReadDebugKeys=1
export PrintDriverDiagnostics=5
export PrintDebugSettings=1
export PrintDebugMessages=1
export PrintXeLogs=1
export PrintBOCreateDestroyResult=1
# clinfo -l
Non-default value of debug variable: PrintDriverDiagnostics = 5
Non-default value of debug variable: PrintDebugSettings = 1
Non-default value of debug variable: PrintDebugMessages = 1
Non-default value of debug variable: PrintXeLogs = 1
Non-default value of debug variable: PrintBOCreateDestroyResult = 1
Shared System USM NOT allowed: KMD does not support
EXT_SET_PAT support is: disabled
INFO: System Info query failed!
WARNING: Failed to query memory info
WARNING: Failed to query engine info
WARNING: Topology query failed!
FATAL: Cannot query EU total parameter!
Platform #0: Intel(R) OpenCL
`-- Device #0: AMD Ryzen 9 5900X 12-Core Processor
strace clinfo -l output (the ioctl operations):
...
openat(AT_FDCWD, "/dev/dri/by-path", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 5
fstat(5, {st_mode=S_IFDIR|0755, st_size=80, ...}) = 0
getdents64(5, 0x571b5acd2090 /* 4 entries */, 32768) = 144
getdents64(5, 0x571b5acd2090 /* 0 entries */, 32768) = 0
close(5) = 0
openat(AT_FDCWD, "/dev/dri/by-path/pci-0000:2f:00.0-render", O_RDWR|O_CLOEXEC) = 5
ioctl(5, DRM_IOCTL_VERSION, 0x7fff929d60a0) = 0
ioctl(5, DRM_IOCTL_VERSION, 0x7fff929d6080) = 0
ioctl(5, DRM_IOCTL_XE_DEVICE_QUERY, 0x7fff929d6140) = 0
ioctl(5, DRM_IOCTL_XE_DEVICE_QUERY, 0x7fff929d6140) = 0
write(2, "Shared System USM NOT allowed: K"..., 52Shared System USM NOT allowed: KMD does not support
) = 52
ioctl(5, DRM_IOCTL_VERSION, 0x7fff929d5cf0) = 0
ioctl(5, DRM_IOCTL_I915_GEM_CREATE_EXT, 0x7fff929d5bf0) = -1 EINVAL
write(1, "EXT_SET_PAT support is: disabled"..., 33EXT_SET_PAT support is: disabled
) = 33
ioctl(5, DRM_IOCTL_I915_REG_READ, 0x7fff929d5bf0) = -1 EINVAL
ioctl(5, DRM_IOCTL_I915_REG_READ, 0x7fff929d5bf0) = -1 EINVAL
readlink("/proc/self/exe", "/usr/bin/clinfo", 511) = 15
ioctl(5, DRM_IOCTL_I915_QUERY, 0x7fff929d5ca0) = -1 EINVAL
write(1, "INFO: System Info query failed!\n", 32INFO: System Info query failed!
) = 32
ioctl(5, DRM_IOCTL_I915_QUERY, 0x7fff929d5b80) = -1 EINVAL
write(2, "WARNING: Failed to query memory "..., 37WARNING: Failed to query memory info
) = 37
ioctl(5, DRM_IOCTL_I915_QUERY, 0x7fff929d5a20) = -1 EINVAL
write(2, "WARNING: Failed to query engine "..., 37WARNING: Failed to query engine info
) = 37
ioctl(5, DRM_IOCTL_I915_QUERY, 0x7fff929d5b30) = -1 EINVAL
write(2, "WARNING: Topology query failed!\n", 32WARNING: Topology query failed!
) = 32
ioctl(5, DRM_IOCTL_I915_GETPARAM, 0x7fff929d5bf0) = -1 EINVAL
write(2, "FATAL: Cannot query EU total par"..., 40FATAL: Cannot query EU total parameter!
) = 40
close(5) = 0
munmap(0x720be1000000, 35286328) = 0
munmap(0x720be3643000, 864456) = 0
close(4) = 0
...
By looking into the code, I found that the runtime queries the device with IoctlHelperXe if xe KMD is found:
compute-runtime/shared/source/os_interface/linux/drm_neo.cpp
Lines 2016 to 2023 in 2459623
bool Drm::queryDeviceIdAndRevision() {
    auto drmVersion = Drm::getDrmVersion(getFileDescriptor());
    if ("xe" == drmVersion) {
        this->setPerContextVMRequired(false);
        return IoctlHelperXe::queryDeviceIdAndRevision(*this);
    }
    return IoctlHelperI915::queryDeviceIdAndRevision(*this);
}
... but sets up a product-specific ioctl helper for DG1:
compute-runtime/shared/source/os_interface/linux/drm_neo.cpp
Lines 1258 to 1273 in 2459623
void Drm::setupIoctlHelper(const PRODUCT_FAMILY productFamily) {
    if (!this->ioctlHelper) {
        auto drmVersion = Drm::getDrmVersion(getFileDescriptor());
        auto productSpecificIoctlHelperCreator = ioctlHelperFactory[productFamily];
        if (productSpecificIoctlHelperCreator && !debugManager.flags.IgnoreProductSpecificIoctlHelper.get()) {
            this->ioctlHelper = productSpecificIoctlHelperCreator.value()(*this);
        } else if ("xe" == drmVersion) {
            this->ioctlHelper = IoctlHelperXe::create(*this);
        } else {
            std::string prelimVersion = "";
            getPrelimVersion(prelimVersion);
            this->ioctlHelper = IoctlHelper::getI915Helper(productFamily, prelimVersion, *this);
        }
        this->ioctlHelper->initialize();
    }
}
... which is an IoctlHelperImpl<IGFX_DG1>:
compute-runtime/shared/source/os_interface/linux/local/dg1/enable_ioctl_helper_dg1.cpp
Lines 12 to 18 in 2459623
struct EnableProductIoctlHelperDg1 {
    EnableProductIoctlHelperDg1() {
        ioctlHelperFactory[IGFX_DG1] = IoctlHelperImpl<IGFX_DG1>::get;
    }
};
static EnableProductIoctlHelperDg1 enableIoctlHelperDg1;
... which inherits from IoctlHelperUpstream and, through it, from IoctlHelperI915, rather than from IoctlHelperXe:
compute-runtime/shared/source/os_interface/linux/ioctl_helper.h
Lines 384 to 385 in 2459623
template <PRODUCT_FAMILY gfxProduct>
class IoctlHelperImpl : public IoctlHelperUpstream {
class IoctlHelperUpstream : public IoctlHelperI915 {
So on the xe KMD the device is still queried with i915 ioctl commands, which all return EINVAL (see the strace output above), and clinfo fails to probe the device.
clinfo works fine with the i915 KMD:
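To make the branch order easier to see, here is a minimal standalone model of the selection logic quoted from Drm::setupIoctlHelper. It is only a sketch: Product, Helper, Creator, selectHelper, makeFactory and checkSelection are hypothetical names invented for illustration, not NEO types, and the model assumes nothing beyond the three branches shown in the snippet.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-ins for the real NEO types; illustrative only.
enum class Product { Dg1, Other };
enum class Helper { ProductSpecificI915, Xe, I915 };

using Creator = std::function<Helper()>;

// Models the branch order of the quoted Drm::setupIoctlHelper:
// a registered product-specific creator is tried before the "xe" check.
inline Helper selectHelper(const std::map<Product, Creator> &factory,
                           Product product,
                           const std::string &drmVersion,
                           bool ignoreProductSpecific) {
    auto it = factory.find(product);
    if (it != factory.end() && !ignoreProductSpecific) {
        return it->second(); // taken for DG1 even when the KMD is xe
    }
    if (drmVersion == "xe") {
        return Helper::Xe;
    }
    return Helper::I915;
}

// Factory with only DG1 registered, modelling enable_ioctl_helper_dg1.cpp,
// whose helper derives from the i915 helper chain.
inline std::map<Product, Creator> makeFactory() {
    return {{Product::Dg1, [] { return Helper::ProductSpecificI915; }}};
}

// Walks through the cases discussed in this issue.
inline void checkSelection() {
    const auto factory = makeFactory();
    // DG1 on xe: the i915-derived product helper is selected.
    assert(selectHelper(factory, Product::Dg1, "xe", false) == Helper::ProductSpecificI915);
    // Same device with the product-specific helper ignored: xe helper.
    assert(selectHelper(factory, Product::Dg1, "xe", true) == Helper::Xe);
    // A product without a registered creator on xe: xe helper.
    assert(selectHelper(factory, Product::Other, "xe", false) == Helper::Xe);
}
```

In this model, the second case corresponds to the IgnoreProductSpecificIoctlHelper debug flag that appears in the quoted condition. Whether setting that flag makes the real runtime work with DG1 on xe is not something I have verified.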
# clinfo -l
Non-default value of debug variable: PrintDriverDiagnostics = 5
Non-default value of debug variable: PrintDebugSettings = 1
Non-default value of debug variable: PrintDebugMessages = 1
Non-default value of debug variable: PrintXeLogs = 1
Non-default value of debug variable: PrintBOCreateDestroyResult = 1
EXT_SET_PAT support is: disabled
INFO: System Info query failed!
WARNING: Failed to request OCL Turbo Boost
Created new BO with GEM_USERPTR, handle: BO-1
NEO_CACHE_PERSISTENT is enabled. Cache is located in: /root/.cache/neo_compiler_cache
Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-2 with size: 4096
Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-3 with size: 4096
Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-4 with size: 4096
Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-5 with size: 4096
Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-6 with size: 4096
computeUnitsUsedForScratch: 768
hwInfo: {96, 672}: (16, 1, 6)
Platform #0: Intel(R) OpenCL Graphics
`-- Device #0: Intel(R) Iris(R) Xe MAX Graphics
Platform #1: Intel(R) OpenCL
`-- Device #0: AMD Ryzen 9 5900X 12-Core Processor
Calling gem close on handle: BO-2
Calling gem close on handle: BO-3
Calling gem close on handle: BO-4
Calling gem close on handle: BO-5
Calling gem close on handle: BO-6
Calling gem close on handle: BO-1
Environment
- CPU: AMD Ryzen 9 5900X
- GPU: Intel Corporation DG1 [Iris Xe MAX Graphics] [8086:4905] (rev 01)
- MB: MSI MAG X570 TOMAHAWK WIFI (MS-7C84) (UEFI boot, resizable BAR and above 4G decoding enabled, CSM disabled)
- OS: Ubuntu 24.04.4 LTS
- Kernel: 6.17.0-14-generic
- Kernel cmdline: xe.force_probe=4905 acpi_enforce_resources=lax pci=nommconf pcie_acs_override=downstream pcie_aspm=off
- Driver: xe (i915 blacklisted in modprobe.d)
- OpenCL loader: Khronos OpenCL ICD Loader 3.0.7
- OpenCL runtime (ICD): intel-opencl-icd 26.05.37020.3-1~24.04~ppa2
- Level Zero: libze-intel-gpu1 26.05.37020.3-1~24.04~ppa2, libze1 1.27.0-1~24.04~ppa1
- GMM: libigdgmm12 22.9.0-1~24.04~ppa1
- Intel GSC: 0.9.5-1~24.04~ppa2
- linux-firmware: 20240318.git3b128b60-0ubuntu2.25