Features

Easy-to-use Python API

Intel® Extension for PyTorch* provides simple frontend Python APIs and utilities that apply performance optimizations such as operator optimization.

Check the API Documentation for details of API functions and Examples for helpful usage tips.
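As a rough sketch of how the frontend API is typically wired into a script, the snippet below wraps the `ipex.optimize` call so the same code still runs on machines where the extension is not installed. The helper name `apply_ipex` is hypothetical, and the snippet assumes the model has already been moved to the target device (and put in eval mode for inference); the API Documentation remains the authoritative reference for the accepted arguments.

```python
try:
    # Optional dependency: only present where the extension is installed.
    import intel_extension_for_pytorch as ipex
except ImportError:
    ipex = None


def apply_ipex(model, optimizer=None):
    """Return (model, optimizer), optimized by the extension when available.

    Hypothetical helper for illustration. Assumes the caller has already
    placed the model on the target device.
    """
    if ipex is None:
        # Fallback: leave both objects untouched so the script still runs.
        return model, optimizer
    if optimizer is None:
        # Inference path: optimize() returns only the model.
        return ipex.optimize(model), None
    # Training path: optimize() returns the (model, optimizer) pair.
    return ipex.optimize(model, optimizer=optimizer)
```

A training script would call `model, optimizer = apply_ipex(model, optimizer)` once, before the training loop.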

DPC++ Extension

Intel® Extension for PyTorch* provides C++ APIs to get the SYCL queue and configure the floating-point math mode.

Check the API Documentation for the details of API functions. DPC++ Extension describes, with a practical example, how to write customized DPC++ kernels and build them with setuptools and CMake.

The rest of this document summarizes specific feature topics, each with a pointer to its detailed discussion:

Channels Last

Compared with the default NCHW memory format, the channels_last (NHWC) memory format can further accelerate convolutional neural networks. In Intel® Extension for PyTorch*, the NHWC memory format has been enabled for most key GPU operators.

For more detailed information, check Channels Last.
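To make the difference between the two layouts concrete, here is a small pure-Python sketch that computes the element strides of a 4-D tensor in each format. The function names are illustrative only and are not part of the extension; the point is that channels_last keeps the logical (N, C, H, W) indexing while storing the channel values of each pixel next to each other.

```python
def contiguous_strides(shape):
    """Element strides for the default NCHW (row-major) layout."""
    n, c, h, w = shape
    return (c * h * w, h * w, w, 1)


def channels_last_strides(shape):
    """Element strides for NHWC storage, still indexed as (n, c, h, w)."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


def offset(index, strides):
    """Flat memory offset of a logical (n, c, h, w) index."""
    return sum(i * s for i, s in zip(index, strides))


shape = (2, 3, 4, 5)  # N, C, H, W
nchw = contiguous_strides(shape)
nhwc = channels_last_strides(shape)

print("NCHW strides:", nchw)
print("NHWC strides:", nhwc)
# Moving to the next channel of the same pixel:
print("NCHW step:", offset((0, 1, 0, 0), nchw))
print("NHWC step:", offset((0, 1, 0, 0), nhwc))
```

In PyTorch itself, the conversion is requested with `tensor.to(memory_format=torch.channels_last)`; the tensor's shape stays (N, C, H, W) and only the strides change.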

Auto Mixed Precision (AMP)

Intel® Extension for PyTorch* supports Auto Mixed Precision (AMP) with BFloat16 and Float16 operator optimizations. BFloat16 is the default low-precision floating-point data type when AMP is enabled. We suggest using AMP to accelerate convolution- and matmul-based neural networks.

For more detailed information, check Auto Mixed Precision (AMP).
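For readers unfamiliar with BFloat16, the following pure-Python sketch shows what the reduction from float32 means numerically: BFloat16 keeps the sign and the 8 exponent bits of float32 and only the top 7 mantissa bits. The helper names are illustrative, the rounding mode shown is round-to-nearest-even, and NaN handling is deliberately left out; this is not how the extension performs the conversion internally.

```python
import struct


def to_bfloat16_bits(x):
    """Round a finite float to the nearest bfloat16; return its 16-bit pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 16 bits that get discarded.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def from_bfloat16_bits(b):
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


def round_to_bfloat16(x):
    """Value that x takes after a float32 -> bfloat16 conversion."""
    return from_bfloat16_bits(to_bfloat16_bits(x))


for value in (1.0, 1.0078125, 1.00390625, 3.14159, 1e38):
    print(value, "->", round_to_bfloat16(value))
```

Because the exponent width matches float32, large magnitudes such as 1e38 stay finite, whereas the precision drops to roughly two to three decimal digits.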

Advanced Configuration

The default settings for Intel® Extension for PyTorch* are sufficient for most use cases. However, if users want to customize Intel® Extension for PyTorch*, advanced configuration is available at build time and runtime.

For more detailed information, check Advanced Configuration.

Optimizer Optimization

Optimizers are a key part of training workloads. Intel® Extension for PyTorch* supports operator fusion for the computation in optimizers.

For more detailed information, check Optimizer Fusion.
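The idea behind fusing optimizer computation can be sketched in plain Python. The example below uses SGD with momentum and weight decay as an assumed update rule: the unfused version makes three separate passes over the parameters (as a chain of individual operators would), while the fused version does the same arithmetic in a single pass. It is a conceptual illustration, not the extension's kernels.

```python
def sgd_step_unfused(params, grads, bufs, lr, momentum, weight_decay):
    """Three passes over the data, one per elementwise operation."""
    # Pass 1: add weight decay to the gradient.
    decayed = [g + weight_decay * p for g, p in zip(grads, params)]
    # Pass 2: update the momentum buffer.
    new_bufs = [momentum * b + g for b, g in zip(bufs, decayed)]
    # Pass 3: apply the update to the parameters.
    new_params = [p - lr * b for p, b in zip(params, new_bufs)]
    return new_params, new_bufs


def sgd_step_fused(params, grads, bufs, lr, momentum, weight_decay):
    """Same arithmetic, but each element is read and written once."""
    new_params, new_bufs = [], []
    for p, g, b in zip(params, grads, bufs):
        b = momentum * b + (g + weight_decay * p)
        new_bufs.append(b)
        new_params.append(p - lr * b)
    return new_params, new_bufs


args = ([1.0, -2.0], [0.5, 0.25], [0.0, 0.0], 0.5, 0.5, 0.25)
unfused = sgd_step_unfused(*args)
fused = sgd_step_fused(*args)
print(unfused)
print(fused)
```

Both functions return identical values; on real hardware the benefit of the fused form comes from fewer kernel launches and fewer trips through memory.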

Simple Trace Tool

Simple Trace is a built-in debugging tool that lets you control whether the call stack is printed for a piece of code. Once enabled, it automatically prints verbose messages for the called operators in a stack format, with indentation to distinguish the context.

For more detailed information, check Simple Trace Tool.
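To give a feel for the stack-style output described above, here is a minimal stand-alone sketch of an indenting call tracer. The class `IndentTracer` and the toy functions are hypothetical and unrelated to the tool's actual interface, which is described on the Simple Trace Tool page.

```python
import functools


class IndentTracer:
    """Toy tracer: prints entry and exit of wrapped calls, indented by depth."""

    def __init__(self):
        self.enabled = False
        self.depth = 0
        self.lines = []

    def _emit(self, text):
        line = "  " * self.depth + text
        self.lines.append(line)
        print(line)

    def wrap(self, fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if not self.enabled:
                return fn(*args, **kwargs)
            self._emit("-> " + fn.__name__)
            self.depth += 1
            try:
                return fn(*args, **kwargs)
            finally:
                self.depth -= 1
                self._emit("<- " + fn.__name__)
        return wrapper


tracer = IndentTracer()


@tracer.wrap
def relu(x):
    return max(x, 0.0)


@tracer.wrap
def linear_relu(x, w, b):
    return relu(w * x + b)


tracer.enabled = True   # trace only this region
result = linear_relu(2.0, 3.0, -1.0)
tracer.enabled = False  # calls after this point are silent
```

Toggling `enabled` around a region of interest mirrors the idea of controlling tracing for one piece of code rather than the whole program.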