oneAPI and GPU support in Intel® Extension for Scikit-learn*
Intel® Extension for Scikit-learn* supports oneAPI concepts, which means that algorithms can be executed on different devices: CPUs and GPUs. This is done via integration with the dpctl package, which implements core oneAPI concepts such as queues and devices.
For execution on GPU, the DPC++ compiler runtime and a GPU driver are required. Refer to the DPC++ system requirements for details.
The DPC++ compiler runtime can be installed either from PyPI or Anaconda:
Install from PyPI:
pip install dpcpp-cpp-rt
Install from Anaconda:
conda install dpcpp_cpp_rt -c intel
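To check that the runtime is visible, you can enumerate the SYCL devices it exposes. The helper below is an illustrative sketch (not part of the package itself); it assumes dpctl is installed and returns an empty list when no runtime is available:

```python
def list_sycl_devices():
    """Return string descriptions of the SYCL devices dpctl can see.

    Returns an empty list when dpctl or the DPC++ runtime is missing,
    so the check degrades gracefully on machines without a GPU stack.
    """
    try:
        import dpctl
    except ImportError:
        return []
    return [str(d) for d in dpctl.get_devices()]

# GPUs appear in this list when the driver and runtime are set up correctly.
print(list_sycl_devices())
```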
Intel® Extension for Scikit-learn* offers two options for running an algorithm on a specific device with the help of dpctl:
Pass input data as dpctl.tensor.usm_ndarray to the algorithm.
The computation will run on the device where the input data is located, and the result will be returned as usm_ndarray to the same device.
All the input data for an algorithm must reside on the same device.
usm_ndarray can only be consumed by the base methods like transform. Note that only the algorithms in Intel® Extension for Scikit-learn* support usm_ndarray. The algorithms from the stock version of scikit-learn do not support this feature.
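As a sketch of this first option, the hypothetical helper below places input data on a SYCL device before fitting. It assumes dpctl and Intel® Extension for Scikit-learn* are installed, and the device name "gpu" is an assumption about your system:

```python
import numpy as np

def fit_on_device(X, device="gpu"):
    """Sketch: fit k-means on data placed on a SYCL device via usm_ndarray.

    Assumes dpctl and Intel(R) Extension for Scikit-learn* are installed,
    and that `device` names an available SYCL device (e.g. "gpu" or "cpu").
    """
    import dpctl.tensor as dpt
    from sklearnex.cluster import KMeans

    X_usm = dpt.asarray(X, device=device)  # copy input data to the device
    # The computation runs on the device where the data lives, and results
    # are returned as usm_ndarray on that same device.
    return KMeans(n_clusters=2, random_state=0).fit(X_usm)
```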
Use global configurations of Intel® Extension for Scikit-learn*:
The target_offload option can be used to set the device primarily used to perform computations. Accepted data types are str and dpctl.SyclQueue. If you pass a string to target_offload, it should either be "auto", which means that the execution context is deduced from the location of input data, or a string with a SYCL* filter selector. The default value is "auto".
The allow_fallback_to_host option is a Boolean flag. If set to True, the computation is allowed to fall back to the host device when a particular estimator does not support the selected device. The default value is False.
These options can be set using the sklearnex.set_config() function or the sklearnex.config_context context manager. To obtain the current values of these options, use sklearnex.get_config(). Note that set_config, get_config, and config_context are always patched after the sklearnex.patch_sklearn() call.
For compatibility reasons, algorithms in Intel® Extension for Scikit-learn* may be offloaded to the device using daal4py.oneapi.sycl_context. However, it is recommended to use one of the options described above for device offloading instead of using sycl_context.
An example of how to patch your code with Intel CPU/GPU optimizations:

import numpy as np
from sklearnex import patch_sklearn, config_context
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
with config_context(target_offload="gpu:0"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)