ITT annotations support
This extension enables a set of functions implementing the Instrumentation and Tracing Technology (ITT) functionality in SYCL device code.
There are three sets of functions defined by this extension, and they serve different purposes.
User APIs
The user code calling these functions must include the corresponding header file(s) provided by the ittnotify project (TBD: reference ITT repo here).
These functions are named using the __itt_notify_ prefix.
Stub APIs
These functions are not defined in any header file, and their declarations follow exactly the declarations of the corresponding user APIs, except that they have an extra _stub suffix in their names.
These functions implement the ITT functionality in a way that allows tools, such as Intel(R) Inspector, to recognize the ITT annotations and base their analysis on them.
For SYCL device code these functions are implemented as noinline and optnone functions so that the corresponding calls may be distinguished in the execution trace. This is just one way of implementing them, and the actual implementation may change in the future.
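As a rough host-side illustration of the noinline/optnone approach described above, the sketch below marks a stub so that a compiler keeps each call as a distinct call site. The macro SKETCH_STUB_ATTRS, the counter g_sketch_stub_calls, and the function sketch_wi_start_stub are hypothetical names introduced only for this example; they are not part of the extension. Since optnone is a Clang attribute, the macro is assumed to fall back to noinline alone on other compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper macro (not part of the extension): request that the
// function is neither inlined nor optimized, where the compiler supports it.
#if defined(__clang__)
#define SKETCH_STUB_ATTRS __attribute__((noinline, optnone))
#elif defined(__GNUC__)
#define SKETCH_STUB_ATTRS __attribute__((noinline))
#else
#define SKETCH_STUB_ATTRS
#endif

// Counts stub invocations so the host-side sketch has an observable effect.
static int g_sketch_stub_calls = 0;

// Hypothetical stand-in for a stub API. A real stub would be recognized by
// a tool in the execution trace; this one only counts its calls.
SKETCH_STUB_ATTRS void sketch_wi_start_stub(std::size_t *group_id,
                                            std::size_t wi_id,
                                            std::uint32_t wg_size) {
  (void)group_id;
  (void)wi_id;
  (void)wg_size;
  ++g_sketch_stub_calls;
}
```

Because the function is not inlined, every call to it should remain visible as a separate call, which is the property the text relies on.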
Compiler wrapper APIs
These functions are not defined in any header file, and they are supposed to be called from compiler-generated code. These thin wrappers provide a convenient way for compilers to produce ITT annotations without generating too much code in the compilers' IR.
These functions have the _wrapper suffix in their names.
Example
// Stub recognized by tools; declared here because no header defines it.
DEVICE_EXTERN_C void __itt_offload_wi_start_stub(
    size_t[3], size_t, uint32_t);

DEVICE_EXTERN_C void __itt_offload_wi_start_wrapper() {
  // Annotate only when the reserved specialization constant is non-zero.
  if (__spirv_SpecConstant(0xFF747469, 0)) {
    size_t GroupID[3] = ...;
    size_t WIId = ...;
    uint32_t WGSize = ...;
    __itt_offload_wi_start_stub(GroupID, WIId, WGSize);
  }
}
A compiler may generate a simple call to __itt_offload_wi_start_wrapper to annotate a kernel entry point. Compare this to the code inside the wrapper function, which a compiler would have to generate if there were no such wrapper.
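The wrapper/stub split can be tried out on the host with a small mock. This is only a sketch under stated assumptions: all mock_ and g_mock_ names are hypothetical, a plain bool stands in for the __spirv_SpecConstant check, and the argument values are placeholders rather than real work-item data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side stand-in for the specialization constant check;
// device code would test __spirv_SpecConstant(0xFF747469, 0) instead.
static bool g_mock_itt_enabled = false;

// Records what the mock stub received, so the effect can be observed.
struct MockStubLog {
  int calls = 0;
  std::size_t last_wi_id = 0;
  std::uint32_t last_wg_size = 0;
};
static MockStubLog g_mock_log;

// Mock stub with the same parameter list as __itt_offload_wi_start_stub.
void mock_wi_start_stub(std::size_t group_id[3], std::size_t wi_id,
                        std::uint32_t wg_size) {
  (void)group_id;
  ++g_mock_log.calls;
  g_mock_log.last_wi_id = wi_id;
  g_mock_log.last_wg_size = wg_size;
}

// Mock wrapper: gathers the arguments and forwards them only when enabled.
void mock_wi_start_wrapper() {
  if (g_mock_itt_enabled) {
    std::size_t group_id[3] = {0, 0, 0}; // placeholder values
    std::size_t wi_id = 5;               // placeholder value
    std::uint32_t wg_size = 64;          // placeholder value
    mock_wi_start_stub(group_id, wi_id, wg_size);
  }
}

// What a compiler would need to emit at a kernel entry point: one call.
void mock_kernel_entry() { mock_wi_start_wrapper(); }
```

With the guard off, mock_kernel_entry() does not reach the stub; with it on, the stub is called once per entry. The argument gathering lives in the wrapper, so the instrumented code only contains the single call.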
Conditional compilation
The DPC++ compiler automatically instruments user code through the SPIRITTAnnotations LLVM pass, which is enabled for targets that natively support specialization constants (i.e., SPIR-V targets). Annotations are generated for barriers, atomics, and work-item start and finish. To minimize the effect of ITT annotations on the performance of the device code, the implementation is guarded with a specialization constant check. This allows users and tools to have one version of the annotated code that may be built with or without ITT annotations "enabled". When the ITT annotations are not enabled, we expect the overall effect of the annotations to be minimized by the dead code elimination optimization(s) performed by the device compilers.
For this purpose we reserve a 1-byte specialization constant numbered 4285822057 (0xFF747469). The users/tools/runtimes should set this specialization constant to a non-zero value to enable the ITT annotations in SYCL device code.
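The two spellings of the reserved ID can be cross-checked at compile time. In the sketch below, kIttSpecConstantId and idByte are hypothetical names used only for illustration. The check on the low three bytes is merely an observation about the value (they happen to be the ASCII codes for 'i', 't', 't'), not a documented design statement.

```cpp
#include <cassert>
#include <cstdint>

// ID of the reserved specialization constant, as given in the text.
constexpr std::uint32_t kIttSpecConstantId = 0xFF747469u;

// The decimal and hexadecimal spellings denote the same value.
static_assert(kIttSpecConstantId == 4285822057u,
              "decimal and hex forms of the ID must agree");

// Returns byte `index` of the ID, where 0 is the least significant byte.
constexpr std::uint8_t idByte(unsigned index) {
  return static_cast<std::uint8_t>((kIttSpecConstantId >> (8 * index)) & 0xFFu);
}

// Observation only: the low three bytes are the ASCII codes of "itt".
static_assert(idByte(0) == 'i' && idByte(1) == 't' && idByte(2) == 't',
              "low three bytes spell 'itt' in ASCII");
```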
The specialization constant value is controlled by the INTEL_ENABLE_OFFLOAD_ANNOTATIONS environment variable. Tools that support ITT annotations must set this environment variable to any value.
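One way to picture the mapping from the environment variable to the 1-byte constant is the host-side sketch below. Both function names are hypothetical, and the text does not state how the runtime actually interprets the variable. The sketch assumes a presence-only check, so an empty value also counts as "set".

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: maps the raw environment value to the 1-byte
// specialization constant value. nullptr means the variable is unset.
// Assumption: any value, including an empty string, enables annotations.
std::uint8_t ittSpecConstantValue(const char *env_value) {
  return env_value != nullptr ? 1 : 0;
}

// Hypothetical helper: reads the variable named in the text and derives
// the value to assign to specialization constant 0xFF747469.
std::uint8_t ittSpecConstantValueFromEnvironment() {
  return ittSpecConstantValue(std::getenv("INTEL_ENABLE_OFFLOAD_ANNOTATIONS"));
}
```

Splitting the lookup from the decision keeps the decision testable without touching the process environment.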