oneAPI and GPU support in Intel® Extension for Scikit-learn*

Intel® Extension for Scikit-learn* supports oneAPI concepts, which means that algorithms can be executed on different devices: CPUs and GPUs. This is done via integration with the dpctl package, which implements core oneAPI concepts like queues and devices.

Prerequisites

For execution on a GPU, the DPC++ compiler runtime and a compatible GPU driver are required. Refer to the DPC++ system requirements for details.

The DPC++ compiler runtime can be installed from either PyPI or Anaconda:

  • Install from PyPI:

    pip install dpcpp-cpp-rt
    
  • Install from Anaconda:

    conda install dpcpp_cpp_rt -c intel
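
After installation, you can verify that dpctl can see your devices. The following is a minimal sketch, assuming dpctl is installed:

import dpctl

# List all SYCL devices visible to dpctl.
for device in dpctl.get_devices():
    print(device)

# Select a GPU device; this raises an exception if no GPU is available.
gpu = dpctl.select_gpu_device()
print(gpu.name)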
    

Device offloading

Intel® Extension for Scikit-learn* offers two options for running an algorithm on a specific device with the help of dpctl:

  • Pass input data as a dpctl.tensor.usm_ndarray to the algorithm.

    The computation will run on the device where the input data is located, and the result will be returned as a usm_ndarray on the same device (see the sketch after this list).

    Note

    All the input data for an algorithm must reside on the same device.

    Warning

    A usm_ndarray can only be consumed by the base methods, such as fit, predict, and transform. Only the algorithms in Intel® Extension for Scikit-learn* support usm_ndarray; the algorithms from the stock version of scikit-learn do not support this feature.

  • Use the global configurations of Intel® Extension for Scikit-learn*:

    1. The target_offload option sets the device primarily used to perform computations. Accepted data types are str and dpctl.SyclQueue. If you pass a string, it should be either "auto", which means that the execution context is deduced from the location of the input data, or a SYCL* filter selector string such as "gpu:0". The default value is "auto".

    2. The allow_fallback_to_host option is a Boolean flag. If set to True, the computation is allowed to fall back to the host device when a particular estimator does not support the selected device. The default value is False.
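
As an illustration of the first option, the following sketch creates the input data directly on a GPU with dpctl.tensor and passes it to an estimator. The "gpu" device filter and the float32 dtype are assumptions for this sketch; use whatever device filter matches your system:

import dpctl.tensor as dpt

from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import DBSCAN

# Allocate the input data on the GPU as a usm_ndarray; float32 avoids
# requiring double-precision support on the device.
X = dpt.asarray([[1., 2.], [2., 2.], [2., 3.],
                 [8., 7.], [8., 8.], [25., 80.]],
                dtype="float32", device="gpu")

# The computation runs on the GPU, and labels_ comes back as a
# usm_ndarray on the same device.
labels = DBSCAN(eps=3, min_samples=2).fit(X).labels_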

These options can be set using the sklearnex.set_config() function or the sklearnex.config_context context manager. To obtain the current values of these options, call sklearnex.get_config().
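
For example, a global default can be set once with set_config and inspected with get_config. The following is a minimal sketch; the "gpu:0" filter string assumes a GPU is present on the system:

from sklearnex import set_config, get_config

# Route computations to the first GPU and allow falling back to the host
# for estimators that do not support the selected device.
set_config(target_offload="gpu:0", allow_fallback_to_host=True)

# Inspect the current values of the options.
print(get_config()["target_offload"])          # gpu:0
print(get_config()["allow_fallback_to_host"])  # True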

Note

The set_config, get_config, and config_context functions are always patched after the sklearnex.patch_sklearn() call.

Compatibility considerations

For compatibility reasons, algorithms in Intel® Extension for Scikit-learn* may still be offloaded to a device using daal4py.oneapi.sycl_context. However, it is recommended to use one of the options described above instead of sycl_context.
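
For reference, the legacy approach looks roughly like the sketch below; sycl_context accepts a device string such as "gpu" (this assumes a daal4py build with oneAPI support is installed):

import numpy as np
from daal4py.oneapi import sycl_context

from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

# Deprecated-style offloading; prefer target_offload or usm_ndarray inputs.
with sycl_context("gpu"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)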

Example

An example of how to patch your code with Intel CPU/GPU optimizations:

import numpy as np

from sklearnex import patch_sklearn, config_context
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
with config_context(target_offload="gpu:0"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)

Note

Current offloading behavior requires that a model be trained and used for inference in the same context, or with no context at all. For example, a model trained in a GPU context with target_offload="gpu:0" throws an error if inference is run outside of that GPU context.
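
In other words, keep training and inference inside the same context, as in the following sketch (the "gpu:0" filter and the choice of estimator are assumptions for illustration):

import numpy as np

from sklearnex import patch_sklearn, config_context
patch_sklearn()

from sklearn.neighbors import KNeighborsClassifier

X = np.array([[1., 2.], [2., 2.], [8., 7.], [8., 8.]], dtype=np.float32)
y = np.array([0, 0, 1, 1])

with config_context(target_offload="gpu:0"):
    model = KNeighborsClassifier(n_neighbors=1).fit(X, y)
    # Run inference inside the same GPU context; calling predict
    # outside this context would raise an error.
    predictions = model.predict(np.array([[2., 3.]], dtype=np.float32))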