This document provides essential information on installation, key features, compatibility, and support for Intel® Distribution for Python*.
To learn what's new in the latest release, refer to Release notes.
- General Questions
- System Requirements
- Licensing, Installation, Updates
- Support and Community
- Components and Key Features
- Performance
Q: What is Intel® Distribution for Python*?
A: Intel Distribution for Python is a version of the Python programming language that has been optimized for performance on Intel hardware. It includes performance-enhanced versions of popular Python libraries and tools to accelerate Python workflows.
Q: Why is Intel doing Intel Distribution for Python?
A: Intel created Intel Distribution for Python to address the growing demand for high-performance computing in the Python ecosystem. Python is widely used in scientific computing, data analysis, artificial intelligence, and machine learning, among other fields. However, Python's ease of use often comes at the cost of performance, especially when compared to lower-level languages like C or Fortran.
Q: Does Intel Distribution for Python support all Python libraries?
A: Intel Distribution for Python supports many popular Python libraries, especially those used in scientific and numerical computing. Refer to What's Included in the Intel® Distribution for Python* for a list of supported packages.
Q: How does Intel Distribution for Python differ from Anaconda* Python?
A: The main difference is that Intel Distribution for Python is optimized for performance on Intel hardware. It includes Intel Performance Libraries such as Intel® oneAPI Math Kernel Library (oneMKL), Intel® oneAPI Data Analytics Library (oneDAL), and Intel® Integrated Performance Primitives (Intel® IPP), which can significantly speed up mathematical, scientific, and analytical computations.
Q: Can I run Intel Distribution for Python on a virtual machine or cloud environment?
A: Yes, Intel Distribution for Python can be run on virtual machines and in cloud environments. We recommend trying Intel Distribution for Python at Intel® Tiber™ Developer Cloud.
Q: Are there any known issues with Intel Distribution for Python?
A: As with any software, there may be known issues at the time of release. Check Intel® Distribution for Python* Known Issues for more information about the current release.
Q: Are there any official code samples?
A: Yes, explore GitHub: oneAPI-samples.
Q: Are there any learning resources for Intel Distribution for Python?
A: Yes, consider using Intel® Tiber™ Developer Cloud. Once you create an account, navigate to Training on the left panel to see a list of available Jupyter* notebooks.
Q: Can I install Intel Distribution for Python on any operating system?
A: Intel Distribution for Python is available on Linux* and Windows*. For up-to-date information check System Requirements.
Q: Does Intel Distribution for Python support both 32-bit and 64-bit systems?
A: Intel Distribution for Python supports 64-bit systems.
Q: Does Intel Distribution for Python support Intel's GPUs (graphics processing units), AI (Artificial Intelligence) chips, NVIDIA*, and AMD* GPUs?
A: Yes, there are components that support GPU. Refer to System Requirements for the up-to-date list of supported hardware and explore Heterogeneous Programming Using Data Parallel Extension for Numba* for AI and HPC to learn more about GPU programming with Python.
Q: Do I need a GPU to use Intel Distribution for Python?
A: A dedicated graphics card is not required for Intel Distribution for Python unless the user's specific applications (such as certain machine learning frameworks) require GPU acceleration.
Q: Can I use Intel Distribution for Python on non-Intel hardware? A: While the distribution is optimized for Intel® processors, it can still run on compatible non-Intel processors. However, the performance gains from Intel-specific optimizations may not be reached on other vendors’ hardware.
Q: Can Intel Distribution for Python be used on NVIDIA GPUs?
A: Data Parallel Controls supports NVIDIA* devices. Follow GitHub instructions on how to compile Data Parallel Controls for CUDA devices.
Q: Can Intel Distribution for Python be used on AMD* CPUs?
A: Intel Distribution for Python is primarily optimized for Intel hardware. However, the distribution can still be used on systems with AMD* CPUs, with some considerations: optimized implementations of NumPy and SciPy libraries may still offer performance improvements due to optimizations that are not exclusive to Intel hardware, such as certain SIMD (Single Instruction, Multiple Data) instructions that are also supported by modern AMD* processors.
Q: How much disk space is required to install Intel Distribution for Python?
A: The required disk space can vary depending on the packages and components the user chooses to install. It is recommended to have several gigabytes of free space to accommodate the distribution and additional packages. You can get a minimal required disk space at the Download page. Also, the offline installer notifies you if you do not have enough space on your disk. You can also take advantage of specific optimized libraries from conda-forge* without installing the whole distribution (NumPy, Data Parallel Extension for NumPy, Intel® Extension for Scikit-learn*, etc.)
Q: Are there any software dependencies I need to install before installing Intel Distribution for Python?
A: If using conda-forge* to install Intel Distribution for Python, follow the conda-forge* Installation Instructions to install miniforge* in your environment. Check System Requirements for specific requirements.
Q: Is Intel Distribution for Python free to use?
A: Yes, Intel Distribution for Python is available for free. You can download and use it without any licensing fees.
Q: How can I install Intel Distribution for Python?
A: Intel Distribution for Python can be installed using the conda-forge* distribution by creating a new environment and with a standalone installer. Refer to Get Started With Intel® Distribution for Python* for more details.
Q: How does Intel Distribution for Python installed by the offline installer differ from Intel Python in conda-forge*?
A: There is no difference. The offline installer and conda-forge* are simply different distribution channels. The offline installer is best suited for environments where internet access is limited or where a standalone installation is preferred. The conda-forge*-based installation offers the convenience of package and environment management, as well as easier updates, but it requires internet access for initial setup and for accessing package repositories. Refer to Get Started With Intel® Distribution for Python* for detailed installation instructions for both.
Q: Do I need an internet connection to install Intel Distribution for Python?
A: If you are using the conda* package manager, an internet connection is typically required to download and install packages. However, Intel also provides offline installer for environments without internet access.
Q: What versions of Python are available with Intel Distribution for Python?
A: Intel Distribution for Python typically supports recent versions of Python. You should check the Anaconda Cloud channel or run python --version or conda info after installing Intel Distribution for Python to get the version.
Q: How do I update Intel Distribution for Python?
A: You could update Intel Distribution for Python using the conda* package manager:
conda activate myenv
conda update intel::intelpython
or download the latest version from the official website.
Q: Will I automatically receive updates for Intel Distribution for Python?
A: No, updates are not automatic. To update:
- if you used conda-forge* to install, check for updates and apply them using the conda command (conda update -all),
- if you used the offline installer, download the latest version from the official website.
Q: How can I uninstall Intel Distribution for Python?
A: If you installed it using conda-forge*, uninstall Intel Distribution for Python using conda command. Otherwise, run the installer:
- On Windows: Installed Apps -> find Intel Distribution for Python and run installer in Uninstall mode.
- On Linux: find the installer folder and run the installer in Uninstall mode.
Q: What should I do if I encounter issues after updating Intel Distribution for Python?
A: If you encounter issues after an update, you can try rolling back to a previous version of the package or environment. If the problem persists, you can seek support from support forums, or check Intel® Distribution for Python* Known Issues. With your active Priority Support, you can get assistance for your Intel® toolkits through the Online Service Center.
Q: What is the license for Intel Distribution for Python?
A: You can find more details at End User License Agreement (EULA).
Q: Can I use Intel Distribution for Python for commercial purposes?
A: You can find more details at End User License Agreement (EULA).
Q: Is Intel Distribution for Python open source?
A: Intel Distribution for Python is built upon the standard Python, which is open source, and it includes various Intel Performance Libraries that are provided for free and are often open source themselves. Intel's approach is to provide a pre-packaged, performance-optimized version of Python that can be used out of the box. While the distribution integrates and optimizes open-source Python libraries, the specific build, components and optimizations applied by Intel are proprietary.
Q: Can I contribute to Intel Distribution for Python?
A: Contributions are welcome. You can explore GitHub repository https://github.com/IntelPython for more details.
Q: What should I do if I encounter installation issues with Intel Distribution for Python?
A: If you encounter issues during installation and your system meets all of the prerequisites, you should try restarting your machine. If the problem persists, you can open a new topic on support forums. With your active Priority Support, you can get assistance for your Intel® toolkits through the Online Service Center.
Q: Is there any support or community for Intel Distribution for Python?
A: Yes, Intel provides support for the distribution, and there is also a community of users https://community.intel.com/t5/Intel-Distribution-for-Python/bd-p/distribution-python
Q: Where can I find support for Intel Distribution for Python?
A: You can find support forums, documentation, and resources on the official Intel website and GitHub repository.
Q: How do I report bugs or request features for Intel Distribution for Python?
A: You can open a new topic on support forums. With your active Priority Support, you can get assistance for your Intel® toolkits through the Online Service Center.
Q: How can I start using Intel® Extension for Scikit-learn*?
A: Once you complete the steps in the Get Started Guide, you can start using Intel® Extension for Scikit-learn*, taking advantage of the performance optimizations provided by Intel. Also, you can find installation instructions on GitHub.
Q: Which algorithms are supported by Intel® Extension for Scikit-learn*?
A: Refer to Supported Algorithms.
Q: How can I enable Intel® Extension for Scikit-learn* in my Python code?
A: You can use the following code:
# Turn on scikit-learn optimizations with these 2 simple lines:
from sklearnex import patch_sklearn
patch_sklearn()
# Import scikit-learn algorithms after the patch is enabled
from sklearn.cluster import KMeans `
Q: Are there any tutorials or code samples available for learning how to use Intel® Extension for Scikit-learn*?
A: You can explore examples on GitHub by following oneAPI Samples -> AI and Analytics Samples. Also, visit Github: scikit-learn-intelex for more.
Q: Is there integration with SYCL?
A: Yes! Consider using Data Parallel Control (dpctl) or Data Parallel Extension for Numba* (numba-dpex). Data Parallel Control provides C and Python bindings for SYCL 2020. It is used when there is a need to explicitly manage SYCL devices and memory within Python code. Data Parallel Extension for Numba* is a free and open-source LLVM-based code generator for portable accelerator programming in Python. It is an extension to the open-source Numba* Just-In-Time (JIT) compiler that enables Python functions to be offloaded to SYCL-compatible devices.
Q: What are the key differences between Data Parallel Extension for Numba* and the standard Numba* JIT compiler?
A: The Data Parallel Extension for Numba* kernel API is developed with the aim of providing SYCL*-like kernel programming features directly in Python. Refer to SYCL* and numba-dpex Feature Comparison for a summary of the SYCL* kernel programming features that are currently supported in Data Parallel Extension for Numba* kernel API. It is important to note, however, that Data Parallel Extension for Numba* does not implement wrappers or analogs of SYCL’s host-callable runtime API. Such features are provided by the Data Parallel Control package.
Q: Are there any tutorials or code samples available for learning how to use Data Parallel Extension for Numba*?
A: There is a tutorial covering the Data Parallel Extension for Numba* kernel programming API (kapi) and introduces the concepts needed to write data-parallel kernels in Data Parallel Extension for Numba*
Q: How can I debug applications that use Data Parallel Extension for Numba*?
A: Data Parallel Extension for Numba* allows you to debug SYCL* kernels with Intel® Distribution for GDB*. You can access the tutorial by following this link.
Q: Which devices can be targeted by using Data Parallel Control?
A: In its current form, Data Parallel Control relies on certain DPC++ extensions of the SYCL standard. Moreover, the binary distribution of Data Parallel Control uses the proprietary Intel(R) oneAPI DPC++ Compiler runtime bundled as part of oneAPI and only supports Intel XPUs. However, Data Parallel Control is compatible with the runtime of the open-source DPC++ SYCL bundle that can be compiled to support a wide range of architectures including CUDA*, AMD* ROC, and HIP*.
Q: What packages are included in the Data Parallel Extensions for Python?
A: Data Parallel Extension for Numpy*, Data Parallel Extension for Numba*, Data Parallel Control library.
Q: Are there any demos for Data Parallel Extensions for Python?
A: Yes, currently there are two demos available: the Monte Carlo method to estimate the value of Pi, and a visualization of the breathtaking process of diving in the famous Mandelbrot fractal. You can find them on our GitHub: DPEP > demos.
Q: Are there any tutorials or code samples available for learning how to use Data Parallel Extensions for Python?
A: You can explore examples on GitHub by following dpctl > examples > cython and DPEP > examples.
Q: How can I start using Intel accelerated NumPy?
A: Once you complete Get Started Guide, you can start using Intel-accelerated versions of NumPy, taking advantage of the performance optimizations provided by Intel.
Q: How does Intel's accelerated NumPy maintain consistent floating-point arithmetic across various accelerators to ensure the portability and robustness of my application?
A: When it comes to program GPUs and especially specialized accelerators, the set of supported primitive data types may be limited. For example, certain GPUs may not support double precision or half-precision. Data Parallel Extensions for Python select default dtype depending on device’s default type in accordance with Python Array API Standard. It can be either float64 or float32. It means that unlike traditional Numpy* programming on a CPU, the heterogeneous computing requires careful management of hardware peculiarities to keep the Python script portable and robust on any device. Refer to Writing Robust Numerical Codes for Heterogeneous Computing for more details. Also, consider reading scientific papers Important for understanding how to write robust numerical code and Consistency of Floating-Point Results using the Intel® Compiler or Why doesn’t my application always give the same answer? if you want to learn more on floating point arithmetic basics.
Q: Does the Intel Distribution for Python help me accelerate Pytorch* or TensorFlow*?
A: Consider using PyTorch* Optimizations from Intel and TensorFlow* Optimizations from Intel respectively.
Q: What optimizations are shipped in Intel Distribution for Python?
A: It depends on the package you are using. Intel® oneAPI Math Kernel Library (oneMKL) and Intel® oneAPI Data Analytics Library bring near-native performance through acceleration of core numerical and machine learning packages. Refer to Accelerating Scientific Python with Intel Optimizations. The Intel Distribution for Python is fully optimized for the latest Intel CPUs, ensuring that users can leverage the full potential of the most recent instruction sets and CPU cores.
Q: How Intel Distribution for Python support accelerated computing?
A: Intel Distribution for Python offers Data Parallel Extensions for Python* to enable standards-based accelerated computing on CPUs and GPUs without using low-level proprietary programming API.
Q: What are Data Parallel Extensions for Python, and how can they benefit my data-intensive applications?
A: Data Parallel Extensions for Python* extend numerical Python capabilities beyond CPU and allow even higher performance gains on data parallel devices, such as GPUs. It consists of three foundational packages:
- dpnp - Data Parallel Extension for Numpy* - a library that implements a subset of Numpy that can be executed on any data parallel device. The subset is a drop-in replacement of core Numpy functions and numerical data types.
- numba_dpex - Data Parallel Extension for Numba* - an extension for Numba compiler that lets you program data-parallel devices as you program CPU with Numba.
- dpctl - Data Parallel Control library that provides utilities for device selection, allocation of data on devices, tensor data structure along with Python* Array API Standard implementation, and support for creation of user-defined data-parallel extensions.
Q: How much faster is Intel Distribution for Python compared to the standard Python distribution?
A: The answer can vary depending on the workload, the libraries used, and the hardware configuration. Deliver Blazing-Fast Python* Data Science and AI Performance on CPUs—with Minimal Code Changes can be a good start. Also, review benchmarking results published at Intel® Distribution for Python* official website.
Q: How do I know if my code is running faster with Intel Distribution for Python?
A: You can benchmark your code using standard Python profiling tools or by measuring the execution time of your scripts before and after switching to Intel Distribution for Python. We did our own experiments: visit Intel® Distribution for Python* official website for the latest benchmarking results.
Q: Will I see performance improvements in all my Python applications with Intel Distribution for Python?
A: The performance improvements you see with Intel Distribution for Python will depend on the nature of your Python applications and how they utilize the optimized libraries provided by the distribution. Intel Distribution for Python provides optimized versions of popular libraries such as NumPy, SciPy, and scikit-learn, among others. If your application heavily relies on these libraries for numerical computations, data analysis, or machine learning tasks, you are more likely to see performance gains. We recommend using Intel VTune Profiler for profiling machine learning workloads.
Q: Is it possible to execute Python code in parallel?
A: Yes, in many ways. Intel NumPy and SciPy are accelerated with an optimized math library such as Intel® oneAPI Math Kernel Library and provide nested parallelism. If you need to use non-standard math, consider trying Numba which acts as a “Just-In-Time” (JIT) compiler based on LLVM. It works to close the performance gap between Python and statically typed, compiled languages like C and C++. It also supports multiple threading runtimes, such as Intel® oneAPI Threading Building Blocks (oneTBB), OpenMP*, and work queue. Refer to a Needle and Thread – An Easy Guide to Multithreading in Python*.
Q: How can I accelerate Python computations using oneMKL?
A: Install Distribution for Python and start using Intel NumPy and SciPy. These libraries utilize oneMKL performance optimizations automatically, so you'll benefit from faster computations without additional configuration.
Q: How do libraries optimized by Intel like NumPy and SciPy achieve better performance?
A: They are using oneMKL under the hood. Refer to Accelerating Scientific Python with Intel Optimizations for more details.
Q: Are there any tips or best practices for achieving maximum performance with Intel Distribution for Python?
A: Yes, there are several tips and best practices you can follow to achieve maximum performance with Intel Distribution for Python:
- Use Intel-optimized Libraries: Ensure that you are using Intel-optimized versions of libraries such as NumPy, SciPy, and scikit-learn that come with Intel Distribution for Python. These libraries are pre-tuned to take advantage of Intel's performance enhancements.
- Leverage Multi-threading: Take advantage of the automatic multi-threading capabilities provided by Intel MKL. You can control the number of threads used by MKL with environment variables like MKL_NUM_THREADS to match your system's CPU core count.
- Profile Your Code: Consider using Intel VTune Profiler for profiling machine learning workloads to identify bottlenecks in your code.
- Keep Libraries Up to Date: Regularly update your Intel Distribution for Python and its libraries to benefit from the latest performance improvements and bug fixes.
Q: Are there any tips or best practices for debugging and tuning performance with Intel Distribution for Python?
A: Yes, there are several tips and best practices you can follow to debug performance issues in Python applications:
- Python Time Measurements: Start with simple timing using Python's built-in timeit module to measure the execution time of different parts of your code.
- Python Profiling Tools: Learn more about native Python profilers, [cProfile(https://docs.python.org/3/library/profile.html#module-cProfile) and profile in combination with visualization tools like SnakeViz* to visualize cProfile output.
- Intel Tools: refer to Intel VTune Profiler for profiling machine learning workloads and learn how to identify hotspots, get insights on CPU and Memory utilization, and more. Consider using Intel® Advisor to profile loops in your application and understand where your code may benefit from parallel execution.
- Review I\O Operations: If your application reads from or writes to disk or intensively uses network, I/O might be a bottleneck. Use tools like strace* to monitor and analyze I/O patterns.
Q: Are there benchmarks for Intel® Extension for Scikit-learn*?
A: Visit Intel® Distribution for Python* official website for the latest benchmarking results. In addition, look at Intel(R) Extension for Scikit-learn* Github > Acceleration page.
Q: What has Intel done to speed up XGBoost?
A: A lot of optimizations speed up XGboost, like automatic memory prefetching, reduced memory consumption, and parallelization. XGboost supports single node and distributed training. For in-depth explanations, turn to An Easy Introduction to XGBoost: A Comprehensive Guide to the Library and Intel Optimizations. See the Intel® Optimization for XGBoost* official website for the benchmarks.
Q: What should I do if I don't see the expected performance improvements with Intel Distribution for Python?
A: Consider reading Intel VTune Profiler for profiling machine learning workloads to identify bottlenecks in your code. Should profiling fail to shed light on the performance bottlenecks you're encountering, we encourage you to seek assistance through our support forums. For more immediate and in-depth assistance, consider contacting Priority Support via Online Service Center.