@turswiming

Set environment variables to limit thread count for NumPy and related libraries.

In certain environments, OpenBLAS/MKL may still spawn their own thread pools.

@Kin-Zhang
Member

It looks good to me, but is there any speed comparison for this change?

@turswiming
Author

Thanks for your reply. I will conduct an ablation study and provide environment details in a few days.

This will fix the delay caused by OpenBLAS / MKL in certain environments.

@turswiming
Author

turswiming commented Feb 11, 2026

Using 8 nproc:
With this fix: CPU usage ~800%, 47 data points processed in 11:17 min (avg. 14.4 sec/data)
Without this fix: CPU usage ~1600%, 8 data points processed in 12:30 min (avg. 93.8 sec/data)

The root cause is that when running inside a Docker container, MKL fails to detect the true vCPU limit and instead attempts to utilize all physical CPUs of the host system. This leads to extreme oversubscription, which severely blocks the training procedure and degrades performance.
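
For readers who want to try the same kind of cap outside this PR, here is a minimal sketch. The thread does not list the exact variables the patch sets, so the names below are an assumption: they are the commonly used thread-count variables for OpenMP, OpenBLAS, MKL, NumExpr, and Accelerate. The helper name `limit_blas_threads` is hypothetical. The variables must be set before NumPy is first imported, because the BLAS backends read them when they load.

```python
import os

# Assumed set of variables; the PR itself may use a different subset.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
    "VECLIB_MAXIMUM_THREADS",
)


def limit_blas_threads(n_threads: int = 1) -> dict:
    """Cap the thread pools of common numeric backends (hypothetical helper).

    Call this before the first `import numpy`; setting the variables
    afterwards usually has no effect on an already loaded backend.
    """
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)
    # Return what was applied so callers can log it.
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


applied = limit_blas_threads(1)
print(applied)

# import numpy as np  # import NumPy only after the call above
```

With one BLAS thread per worker, 8 worker processes should keep CPU usage near 800%, which is consistent with the numbers reported above.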

@turswiming
Author

environment.yml

@Kin-Zhang
Member

Thanks for the quick reply.

> The root cause is that when running inside a Docker container, MKL fails to detect the true vCPU limit and instead attempts to utilize all physical CPUs of the host system.

Does this mean it only affects programs running inside Docker, or is it a general effect?
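
One way to check whether a given environment is exposed to this is to compare the CPU count the process sees with the container's CPU quota. The sketch below is only a diagnostic and is not part of this PR. It assumes a Linux host using cgroup v2, where the quota is exposed in `/sys/fs/cgroup/cpu.max` as `<quota> <period>` or `max <period>`. The function names are hypothetical. `os.cpu_count()` reports the host's logical CPUs and does not take a CFS quota into account, so a large gap between the two numbers suggests a library that sizes its thread pool from the host count could oversubscribe.

```python
import os
from pathlib import Path
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse cgroup v2 `cpu.max` content into a CPU count, or None if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None  # no quota configured
    return int(quota) / int(period)


def container_cpu_limit(path: str = "/sys/fs/cgroup/cpu.max") -> Optional[float]:
    """Return the CPU quota of the current cgroup, or None if unavailable."""
    try:
        return parse_cpu_max(Path(path).read_text())
    except (OSError, ValueError):
        # File missing (not cgroup v2, not Linux) or unexpected format.
        return None


host_cpus = os.cpu_count()
limit = container_cpu_limit()
print(f"logical CPUs visible to the process: {host_cpus}")
print(f"cgroup CPU quota: {limit if limit is not None else 'none detected'}")
```

If no quota is detected, as on a bare-metal run without cgroup limits, the two values do not diverge and this particular mismatch would not be expected to arise.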

@Kin-Zhang Kin-Zhang self-requested a review February 12, 2026 00:22