oneAPI and GPU support in Intel® Extension for Scikit-learn*
Intel® Extension for Scikit-learn* supports oneAPI concepts, which means that algorithms can be executed on different devices: CPUs and GPUs. This is done via integration with the dpctl package, which implements core oneAPI concepts such as queues and devices.
Prerequisites
For execution on GPU, the DPC++ compiler runtime and a compatible driver are required. Refer to the DPC++ system requirements for details.
The DPC++ compiler runtime can be installed from either PyPI or Anaconda:
Install from PyPI:
pip install dpcpp-cpp-rt
Install from Anaconda:
conda install dpcpp_cpp_rt -c intel
Device offloading
Intel® Extension for Scikit-learn* offers two options for running an algorithm on a specific device with the help of dpctl:
Pass input data as dpctl.tensor.usm_ndarray to the algorithm.
The computation will run on the device where the input data is located, and the result will be returned as usm_ndarray to the same device.
Note: All the input data for an algorithm must reside on the same device.
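The first option can be sketched as follows. This is a minimal sketch, assuming dpctl is installed with GPU support and a SYCL GPU device is available; KMeans is used purely for illustration:

```python
import numpy as np
import dpctl.tensor as dpt
from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import KMeans

X = np.array([[1., 2.], [2., 2.], [8., 7.], [8., 8.]], dtype=np.float32)

# Copy the input to the GPU; the computation follows the data,
# so fit() runs on the device where X_gpu resides.
X_gpu = dpt.asarray(X, device="gpu")

# Array results come back as usm_ndarray on the same device.
kmeans = KMeans(n_clusters=2).fit(X_gpu)
labels = kmeans.predict(X_gpu)
```

Because computation follows the data, no extra configuration is needed here: the device is implied by where the usm_ndarray lives.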
Warning: The usm_ndarray can only be consumed by the base methods, such as fit, predict, and transform. Note that only the algorithms in Intel® Extension for Scikit-learn* support usm_ndarray. The algorithms from the stock version of scikit-learn do not support this feature.
Use global configurations of Intel® Extension for Scikit-learn*:
The target_offload option can be used to set the device primarily used to perform computations. Accepted data types are str and dpctl.SyclQueue. If you pass a string to target_offload, it should be either "auto", which means that the execution context is deduced from the location of the input data, or a string with a SYCL* filter selector. The default value is "auto".
The allow_fallback_to_host option is a Boolean flag. If set to True, the computation is allowed to fall back to the host device when a particular estimator does not support the selected device. The default value is False.
These options can be set using the sklearnex.set_config() function or the sklearnex.config_context context manager. To obtain the current values of these options, call sklearnex.get_config().
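As a sketch, the global configuration can be manipulated like this (assuming sklearnex is installed; no GPU is required merely to set or read the options, only to run computations with a GPU target):

```python
from sklearnex import patch_sklearn, set_config, get_config, config_context
patch_sklearn()

# Read the current values of the options.
cfg = get_config()
print(cfg["target_offload"], cfg["allow_fallback_to_host"])

# Set the options globally ...
set_config(target_offload="gpu:0", allow_fallback_to_host=True)

# ... or only within a scope, using the context manager.
with config_context(target_offload="cpu"):
    assert get_config()["target_offload"] == "cpu"

# Restore the defaults.
set_config(target_offload="auto", allow_fallback_to_host=False)
```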
Note: The functions set_config, get_config, and config_context are always patched after the sklearnex.patch_sklearn() call.
Compatibility considerations
For compatibility reasons, algorithms in Intel® Extension for Scikit-learn* may be offloaded to the device using daal4py.oneapi.sycl_context. However, it is recommended to use one of the options described above for device offloading instead of sycl_context.
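For reference, the legacy approach looks roughly like this (a sketch assuming daal4py is installed with oneAPI support and a GPU is available; the options described above should be preferred):

```python
import numpy as np
from daal4py.oneapi import sycl_context
from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [8., 7.], [8., 8.]], dtype=np.float32)

# Legacy style: computations inside the context are offloaded to the GPU.
with sycl_context("gpu"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
```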
Example
An example of how to patch your code with Intel CPU/GPU optimizations:
import numpy as np
from sklearnex import patch_sklearn, config_context
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
with config_context(target_offload="gpu:0"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
Note
Current offloading behavior requires a model to be fitted and used for inference within the same context, or the absence of a context, for both. For example, a model trained in a GPU context with target_offload="gpu:0" throws an error if inference is performed outside that GPU context.
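For instance, keeping training and inference inside one context avoids this error (a sketch assuming sklearnex is installed and a GPU is available):

```python
import numpy as np
from sklearnex import patch_sklearn, config_context
patch_sklearn()

from sklearn.neighbors import KNeighborsClassifier

X = np.array([[0., 0.], [1., 1.], [8., 8.], [9., 9.]], dtype=np.float32)
y = np.array([0, 0, 1, 1])

with config_context(target_offload="gpu:0"):
    model = KNeighborsClassifier(n_neighbors=1).fit(X, y)
    # Inference happens in the same GPU context as training;
    # calling model.predict(X) outside this block would raise an error.
    pred = model.predict(X)
```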