Ahead of Time (AOT) Compilation

Introduction

AOT Compilation is a helpful feature for the development lifecycle or distribution time, when you know beforehand what your target device is going to be at application execution time. When AOT compilation is enabled, no additional compilation time is needed when running the application. It also benefits product quality, since just-in-time (JIT) compilation bugs are avoided (JIT is skipped) and the final code executed on the target device can be tested as-is before delivery to end-users. The disadvantage of this feature is that the size of the final distributed binary increases significantly (e.g. from 500MB to 2.5GB for Intel® Extension for PyTorch*).

Use case

Intel® Extension for PyTorch* provides the build option USE_AOT_DEVLIST for users who install Intel® Extension for PyTorch* via source compilation to configure the device list for AOT compilation. Each target device in the device list is identified by its device type. Multi-target AOT compilation is supported by using a comma (,) as a delimiter in the device list. See the table below for the AOT settings targeting the Intel® Data Center GPU Flex Series, Intel® Data Center GPU Max Series, and Intel® Arc™ A-Series GPUs.

Supported HW                               AOT Setting
Intel® Data Center GPU Flex Series 170     USE_AOT_DEVLIST='ats-m150'
Intel® Data Center GPU Max Series          USE_AOT_DEVLIST='pvc'
Intel® Arc™ A-Series                       USE_AOT_DEVLIST='ats-m150'

Note: Multiple AOT settings can be combined by separating them with a comma (,), so that the compiled wheel file supports multiple AOT targets. E.g. a wheel file built with USE_AOT_DEVLIST='ats-m150,pvc' has AOT enabled for both ats-m150 and pvc, as in the build sketch below.
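For reference, a source build with multi-target AOT enabled might look like the following. This is a minimal sketch only: the checkout path and the build entry point (python setup.py bdist_wheel) are assumptions and may differ from the actual build instructions for your release.

    # Assumed source build of Intel® Extension for PyTorch* with AOT enabled
    # for both ats-m150 and pvc targets (adjust the list for your hardware).
    cd intel-extension-for-pytorch            # path to your source checkout (assumed)
    export USE_AOT_DEVLIST='ats-m150,pvc'     # comma-separated AOT device list
    python setup.py bdist_wheel               # build entry point (assumed)
    pip install dist/*.whl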

Intel® Extension for PyTorch* enables AOT compilation for Intel GPU target devices in its prebuilt wheel files. Intel® Data Center GPU Flex Series 170 and Intel® Data Center GPU Max Series are the enabled target devices in the current release, with prototype support for Intel® Arc™ A-Series GPUs. If Intel® Extension for PyTorch* is executed on a device that is not pre-configured in USE_AOT_DEVLIST, the application can still run, because JIT compilation is triggered automatically to allow execution on the current device. This adds compilation time during execution, as the quick check below illustrates.
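As a quick illustration, the hypothetical one-liner below runs a simple tensor operation on the xpu device. On a device that is covered by the wheel's AOT device list the kernel binaries are already present; on a device that is not covered, the first run includes the extra JIT compilation time mentioned above. It assumes a working installation of the extension and the GPU driver.

    # Hypothetical check: the first GPU operation triggers JIT compilation
    # when the current device is not in the wheel's AOT device list.
    python -c "import torch, intel_extension_for_pytorch; print(torch.randn(1024, 1024, device='xpu').sum())"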

For more GPU platforms, please refer to Use AOT for Integrated Graphics (Intel GPU).

Requirement

Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver must be installed beforehand to use AOT compilation. Once USE_AOT_DEVLIST is configured, Intel® Extension for PyTorch* passes the -fsycl-targets=spir64_gen option and the -Xs "-device ${USE_AOT_DEVLIST}" option to the compiler to generate binaries that utilize the Intel® oneAPI Level Zero backend.
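For context, the same two options drive AOT compilation when building a standalone SYCL program with the oneAPI DPC++ compiler. The command below is only an illustration of what the extension's build does under the hood; the compiler name (icpx) and the source file name are assumptions, not part of the extension's build system.

    # Illustrative standalone AOT compilation (icpx and kernel.cpp are assumed):
    # spir64_gen selects ahead-of-time code generation for Intel GPUs, and the
    # -device list names the target devices, mirroring USE_AOT_DEVLIST.
    icpx -fsycl -fsycl-targets=spir64_gen -Xs "-device ats-m150,pvc" kernel.cpp -o kernel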