Code Samples

Warning

DRAFT DOCUMENTATION - This documentation is currently in draft status and subject to change.

The sample applications demonstrate how to use the tracing and profiling capabilities of the PTI SDK. All samples are located in the samples/ directory of the repository.

Repository Location:

ptiView API Samples

These samples demonstrate how to use the ptiView API to trace GPU operations, including kernel execution, memory copies, and API calls.

| Sample | Description | PTI_VIEW_... |
|---|---|---|
| vector_sq_add | SYCL vector square-add kernel | DEVICE_GPU_KERNEL, DEVICE_GPU_MEM_COPY, DEVICE_GPU_MEM_FILL, RUNTIME_API, EXTERNAL_CORRELATION |
| dpc_gemm | DPC++ matrix multiplication with comprehensive tracing | DEVICE_GPU_KERNEL, DEVICE_GPU_MEM_COPY, DEVICE_GPU_MEM_FILL, RUNTIME_API, DRIVER_API, EXTERNAL_CORRELATION, COLLECTION_OVERHEAD |
| onemkl_gemm | Matrix multiplication using Intel(R) oneMKL library | DEVICE_GPU_KERNEL, DEVICE_GPU_MEM_COPY, DEVICE_GPU_MEM_FILL, RUNTIME_API |
| itt_ccl | Intel(R) oneCCL library operations tracing (Linux only) | COMMUNICATION |
| dpc_gemm_threaded | Multi-threaded DPC++ matrix multiplication | DEVICE_GPU_KERNEL, DEVICE_GPU_MEM_COPY, DEVICE_GPU_MEM_FILL, RUNTIME_API, DRIVER_API, COLLECTION_OVERHEAD |
| omp_vec_add | C-language OpenMP sample with GPU offload | DEVICE_GPU_KERNEL, DEVICE_GPU_MEM_COPY, DEVICE_GPU_MEM_FILL, DRIVER_API |
| view_record_versioned | Version-aware tracing that handles v1 and v2 view record structures using ptiVersion() | DEVICE_GPU_KERNEL, DEVICE_GPU_MEM_COPY, DEVICE_GPU_MEM_FILL, RUNTIME_API, DRIVER_API, COLLECTION_OVERHEAD |

ptiCallback API Sample

callback

Demonstrates the experimental ptiCallback API for synchronous GPU operation notifications.

Warning

The ptiCallback API is experimental and subject to change. See PTI Callback API Reference (Experimental) for documentation.

GPU Hardware Metrics Collection Samples

metrics_scope

Per-kernel hardware metrics collection using the ptiMetricsScope API.

metrics_perf

Device-level, time-based metrics sampling using the ptiMetrics API. Use the --list-metrics option to print all metric sets and metrics available on the system GPUs.

metrics_iso3dfd_dpcpp

Hardware metrics collection in an ISO3DFD stencil computation workload.
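As a quick illustration of the --list-metrics option mentioned for metrics_perf, the following sketch lists the available metrics on Linux. It assumes that metrics_perf is built into build/bin alongside the other samples, which this page does not state explicitly; adjust the METRICS_PERF path if your build places it elsewhere.

```shell
# Assumption: metrics_perf lands in build/bin like the other sample binaries.
METRICS_PERF=./build/bin/metrics_perf

if [ -x "$METRICS_PERF" ]; then
  # Print all metric sets and metrics available on the system GPUs.
  "$METRICS_PERF" --list-metrics
else
  # The binary is missing or not executable; nothing is run.
  echo "metrics_perf not found at $METRICS_PERF; build the PTI SDK first" >&2
fi
```

Checking for the binary first keeps the snippet safe to paste before the SDK has been built.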

Building Samples

Samples are built automatically with the PTI SDK. See Building and Installing for build instructions.

Once the build completes, run a sample by invoking its binary from the SDK build directory:

Linux:

./build/bin/vector_sq_add
./build/bin/dpc_gemm
./build/bin/metrics_scope

Windows:

build\bin\vector_sq_add.exe
build\bin\dpc_gemm.exe
build\bin\metrics_scope.exe
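To run the three Linux commands above in one pass, a small wrapper script can help. The sketch below is not part of the PTI SDK: run_sample and BIN_DIR are hypothetical names introduced here, and the default of ./build/bin simply mirrors the paths shown above. It treats a zero exit status as a pass, which is an assumption about how the samples report success.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the PTI SDK.
# BIN_DIR defaults to the path used in the commands above; override it if needed.
BIN_DIR="${BIN_DIR:-./build/bin}"

# Run one sample by name and report SKIP, PASS, or FAIL.
run_sample() {
  sample="$BIN_DIR/$1"
  if [ ! -x "$sample" ]; then
    echo "SKIP $1 (not found in $BIN_DIR)"
    return 2
  fi
  if "$sample"; then
    echo "PASS $1"
  else
    echo "FAIL $1"
    return 1
  fi
}

# Same three samples as in the Linux commands above.
for name in vector_sq_add dpc_gemm metrics_scope; do
  run_sample "$name" || true
done
```

Setting BIN_DIR in the environment, for example BIN_DIR=/path/to/bin sh run_samples.sh (the script name is arbitrary), points the loop at a different build location.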

Next Steps