Advanced Configuration
The default settings for Intel® Extension for PyTorch* are sufficient for most use cases. However, if users want to customize Intel® Extension for PyTorch*, advanced configuration is available at build time and runtime.
Build Time Configuration
The following build options are supported by Intel® Extension for PyTorch*. Users who install Intel® Extension for PyTorch* by compiling from source can override the default configuration by explicitly setting a build option to ON or OFF before building.
Build Option | Default Value | Description |
---|---|---|
USE_ONEMKL | ON | Use oneMKL BLAS |
USE_CHANNELS_LAST_1D | ON | Use channels last 1d |
USE_PERSIST_STREAM | ON | Use persistent oneDNN stream |
USE_SCRATCHPAD_MODE | ON | Use oneDNN scratchpad mode |
USE_PRIMITIVE_CACHE | ON | Cache oneDNN primitives by FRAMEWORK for specific operators |
USE_QUEUE_BARRIER | ON | Use queue submit_barrier, otherwise use dummy kernel |
USE_MULTI_CONTEXT | OFF | Create DPC++ runtime context per device |
USE_PROFILER | ON | Use XPU legacy profiler in build |
USE_KINETO | ON | Use PyTorch Kineto in build |
USE_SYCL_ASSERT | OFF | Enable assert in SYCL kernels |
USE_ITT_ANNOTATION | OFF | Enable ITT annotation in SYCL kernels |
USE_SPLIT_FP64_LOOPS | ON | Split FP64 loops into separate kernel for element-wise kernels |
USE_XETLA | ON | Use XeTLA-based custom kernels |
BUILD_BY_PER_KERNEL | OFF | Build by DPC++ per_kernel option (exclusive with USE_AOT_DEVLIST) |
BUILD_INTERNAL_DEBUG | OFF | Use internal debug code path |
BUILD_SEPARATE_OPS | OFF | Build each operator in separate library |
BUILD_SIMPLE_TRACE | ON | Build simple trace for each registered operator |
USE_AOT_DEVLIST | "" | Set device list for AOT build |
BUILD_OPT_LEVEL | "" | Add build option -Ox; accepted values: 0, 1 |
Build options that accept ON or OFF can also be set to 1 or 0: ON is equivalent to 1 and OFF is equivalent to 0.
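The constraints listed in the table can be checked before starting a long source build. The following is a minimal sketch, not part of Intel® Extension for PyTorch*: the helper names `normalize_switch` and `check_build_options` are hypothetical, and the strict, case-sensitive matching of ON/OFF/1/0 is an assumption, since this page does not say whether other spellings are accepted.

```python
# Hypothetical pre-build check; these helpers are illustrative and are not
# provided by Intel Extension for PyTorch.

def normalize_switch(value):
    """Map ON or 1 to True and OFF or 0 to False.

    Assumption: matching is case-sensitive; other spellings are rejected.
    """
    text = str(value).strip()
    if text in ("ON", "1"):
        return True
    if text in ("OFF", "0"):
        return False
    raise ValueError(f"expected ON, OFF, 1 or 0, got {value!r}")


def check_build_options(options):
    """Return a list of problems found in a dict of build option overrides."""
    problems = []

    # BUILD_BY_PER_KERNEL is documented as exclusive with USE_AOT_DEVLIST.
    per_kernel = normalize_switch(options.get("BUILD_BY_PER_KERNEL", "OFF"))
    aot_devlist = str(options.get("USE_AOT_DEVLIST", ""))
    if per_kernel and aot_devlist:
        problems.append("BUILD_BY_PER_KERNEL is exclusive with USE_AOT_DEVLIST")

    # BUILD_OPT_LEVEL is documented as accepting 0 or 1 (empty means default).
    opt_level = str(options.get("BUILD_OPT_LEVEL", ""))
    if opt_level not in ("", "0", "1"):
        problems.append("BUILD_OPT_LEVEL accepts only 0 or 1")

    return problems
```

An empty list from `check_build_options` means none of the two documented constraints were violated; it does not validate the contents of a device list.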
Runtime Configuration
The following launch options are supported by Intel® Extension for PyTorch*. Users who run AI models on XPU can override the default configuration by setting the option value through an environment variable before launching execution.
Launch Option (CPU, GPU) | Default Value | Description |
---|---|---|
IPEX_FP32_MATH_MODE | FP32 | Set values for FP32 math mode (valid values: FP32, TF32, BF32). Refer to API Documentation for details. |
Launch Option (GPU only) | Default Value | Description |
---|---|---|
IPEX_VERBOSE | 0 | Set verbose level with synchronization execution mode |
IPEX_XPU_SYNC_MODE | 0 | Set to 1 to enforce synchronization execution mode |
IPEX_TILE_AS_DEVICE | 1 | Set to 0 to disable tile partition and map per root device. Only works when ZE_FLAT_DEVICE_HIERARCHY=COMPOSITE |
Launch Option (Experimental) | Default Value | Description |
---|---|---|
IPEX_SIMPLE_TRACE | 0 | Set to 1 to enable simple trace for all operators* |
IPEX_ZE_TRACING | 0 | Set to 1 to enable Kineto profiling based on Level Zero tracing |
Launch options that accept 1 or 0 can also be set to ON or OFF: ON is equivalent to 1 and OFF is equivalent to 0.
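Two of the documented constraints are easy to get wrong: `IPEX_FP32_MATH_MODE` accepts only FP32, TF32, or BF32, and `IPEX_TILE_AS_DEVICE` only works when `ZE_FLAT_DEVICE_HIERARCHY=COMPOSITE`. The sketch below checks an environment for both. The function `check_launch_options` is hypothetical and not part of the extension, and treating an unset `ZE_FLAT_DEVICE_HIERARCHY` as "not COMPOSITE" is an assumption, since the driver default is not stated here.

```python
import os

# Valid values as listed in the launch option table.
VALID_FP32_MATH_MODES = ("FP32", "TF32", "BF32")


def check_launch_options(env=None):
    """Return warnings for launch options that look misconfigured.

    Hypothetical helper: it only inspects environment variables and does
    not query Intel Extension for PyTorch itself.
    """
    env = os.environ if env is None else env
    warnings = []

    # FP32 is the documented default when the variable is not set.
    mode = env.get("IPEX_FP32_MATH_MODE", "FP32")
    if mode not in VALID_FP32_MATH_MODES:
        warnings.append(f"IPEX_FP32_MATH_MODE={mode} is not one of FP32, TF32, BF32")

    # Assumption: an unset ZE_FLAT_DEVICE_HIERARCHY is treated as not COMPOSITE.
    if "IPEX_TILE_AS_DEVICE" in env and env.get("ZE_FLAT_DEVICE_HIERARCHY") != "COMPOSITE":
        warnings.append(
            "IPEX_TILE_AS_DEVICE only works when ZE_FLAT_DEVICE_HIERARCHY=COMPOSITE"
        )

    return warnings
```

Calling `check_launch_options()` with no argument inspects the current process environment; passing a plain dict makes it easy to check a planned configuration before exporting anything.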
Examples of configuring the launch options:
Set one or more options before running the model
export IPEX_VERBOSE=1
export IPEX_FP32_MATH_MODE=TF32
...
python ResNet50.py
Set one option when running the model
IPEX_VERBOSE=1 python ResNet50.py
Set more than one option when running the model
IPEX_VERBOSE=1 IPEX_FP32_MATH_MODE=TF32 python ResNet50.py
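The same per-run overrides can be applied from a Python launcher script instead of the shell, which may be convenient when sweeping over several configurations. This is a minimal sketch using only the standard library; `build_env` and `run_with_options` are hypothetical names, and the script path is whatever model script you would otherwise pass to `python`.

```python
import os
import subprocess
import sys


def build_env(overrides, base=None):
    """Return a copy of the base environment with launch options applied.

    Values are converted to strings because environment variables are text.
    """
    env = dict(os.environ if base is None else base)
    for name, value in overrides.items():
        env[name] = str(value)
    return env


def run_with_options(script, overrides):
    """Run a script in a child Python process with the overrides set.

    Equivalent in spirit to `NAME=value python script` in a shell: the
    overrides affect only the child process, not the caller.
    """
    return subprocess.run(
        [sys.executable, script],
        env=build_env(overrides),
        check=True,  # raise CalledProcessError if the script fails
    )
```

For instance, `run_with_options("ResNet50.py", {"IPEX_VERBOSE": 1, "IPEX_FP32_MATH_MODE": "TF32"})` would mirror the last command above. Because the variables are set on the child process before it starts, this avoids any question of whether the options are read before or after the launcher itself imports the extension.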