DPC++ Runtime
Runtime libraries for oneAPI Data Parallel C++
_pi_kernel Struct Reference

Implementation of a PI Kernel for CUDA.

#include <cuda/pi_cuda.hpp>

Classes

struct  arguments
 Structure that holds the arguments to the kernel.
 
struct  Hash
 

Public Types

using native_type = CUfunction
 
using native_type = hipFunction_t
 

Public Member Functions

 _pi_kernel (CUfunction func, CUfunction funcWithOffsetParam, const char *name, pi_program program, pi_context ctxt)
 
 _pi_kernel (CUfunction func, const char *name, pi_program program, pi_context ctxt)
 
 ~_pi_kernel ()
 
pi_program get_program () const noexcept
 
pi_uint32 increment_reference_count () noexcept
 
pi_uint32 decrement_reference_count () noexcept
 
pi_uint32 get_reference_count () const noexcept
 
native_type get () const noexcept
 
native_type get_with_offset_parameter () const noexcept
 
bool has_with_offset_parameter () const noexcept
 
pi_context get_context () const noexcept
 
const char * get_name () const noexcept
 
pi_uint32 get_num_args () const noexcept
 Returns the number of arguments, excluding the implicit global offset.
 
void set_kernel_arg (int index, size_t size, const void *arg)
 
void set_kernel_local_arg (int index, size_t size)
 
void set_implicit_offset_arg (size_t size, std::uint32_t *implicitOffset)
 
const arguments::args_index_t & get_arg_indices () const
 
pi_uint32 get_local_size () const noexcept
 
void clear_local_size ()
 
 _pi_kernel ()
 
 _pi_kernel (hipFunction_t func, hipFunction_t funcWithOffsetParam, const char *name, pi_program program, pi_context ctxt)
 
 _pi_kernel (hipFunction_t func, const char *name, pi_program program, pi_context ctxt)
 
 ~_pi_kernel ()
 
pi_program get_program () const noexcept
 
pi_uint32 increment_reference_count () noexcept
 
pi_uint32 decrement_reference_count () noexcept
 
pi_uint32 get_reference_count () const noexcept
 
native_type get () const noexcept
 
native_type get_with_offset_parameter () const noexcept
 
bool has_with_offset_parameter () const noexcept
 
pi_context get_context () const noexcept
 
const char * get_name () const noexcept
 
pi_uint32 get_num_args () const noexcept
 Returns the number of arguments, excluding the implicit global offset.
 
void set_kernel_arg (int index, size_t size, const void *arg)
 
void set_kernel_local_arg (int index, size_t size)
 
void set_implicit_offset_arg (size_t size, std::uint32_t *implicitOffset)
 
arguments::args_index_t get_arg_indices () const
 
pi_uint32 get_local_size () const noexcept
 
void clear_local_size ()
 
 _pi_kernel (ze_kernel_handle_t Kernel, bool OwnZeKernel, pi_program Program)
 
bool hasIndirectAccess ()
 

Public Attributes

native_type function_
 
native_type functionWithOffsetParam_
 
std::string name_
 
pi_context context_
 
pi_program program_
 
std::atomic_uint32_t refCount_
 
size_t reqdThreadsPerBlock_ [REQD_THREADS_PER_BLOCK_DIMENSIONS]
 
struct _pi_kernel::arguments args_
 
ze_kernel_handle_t ZeKernel
 
bool OwnZeKernel
 
pi_program Program
 
std::unordered_set< std::pair< void *const, MemAllocRecord > *, Hash > MemAllocs
 
std::atomic< pi_uint32 > SubmissionsCount
 
ZeCache< ZeStruct< ze_kernel_properties_t > > ZeKernelProperties
 

Static Public Attributes

static constexpr pi_uint32 REQD_THREADS_PER_BLOCK_DIMENSIONS = 3u
 

Detailed Description

Implementation of a PI Kernel for CUDA.

Implementation of a PI Kernel for HIP.

A PI Kernel object is stateful: arguments are set on it one at a time ahead of a given invocation. A CUfunction object carries no such state; its arguments are simply passed together with the launch call. To emulate the PI Kernel interface, the CUDA implementation stores the list of arguments, together with their sizes and offsets, and keeps them for the later dispatch. Note that the PI API specifies local memory as a size per individual argument, whereas CUDA requires only the total amount of shared memory, because shared memory is not passed as a parameter. A compiler pass converts the PI local memory model into the CUDA shared memory model; this object only computes the total amount of shared memory and the initial offset of each parameter.

A PI Kernel object is stateful: arguments are set on it one at a time ahead of a given invocation. A hipFunction_t object carries no such state; its arguments are simply passed together with the launch call. To emulate the PI Kernel interface, the HIP implementation stores the list of arguments, together with their sizes and offsets, and keeps them for the later dispatch. Note that the PI API specifies local memory as a size per individual argument, whereas HIP requires only the total amount of shared memory, because shared memory is not passed as a parameter. A compiler pass converts the PI local memory model into the HIP shared memory model; this object only computes the total amount of shared memory and the initial offset of each parameter.
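
The bookkeeping described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in pi_cuda.hpp or pi_hip.hpp: `kernel_args_sketch`, `pointers()` and `offset_at()` are hypothetical names, and alignment of local offsets is ignored. Regular arguments are copied into owned storage; each local argument is replaced by its starting offset into a single shared-memory allocation while the total size accumulates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the nested `arguments` struct. Names and layout
// are illustrative only and do not mirror the real implementation.
struct kernel_args_sketch {
  std::vector<std::vector<char>> storage; // owned copy of each argument's bytes
  std::size_t local_bytes = 0;            // running total of local (shared) memory

  // Copy `size` bytes from `arg` into slot `index`, growing the list if needed.
  void add_arg(std::size_t index, std::size_t size, const void *arg) {
    if (index >= storage.size())
      storage.resize(index + 1);
    const char *bytes = static_cast<const char *>(arg);
    storage[index].assign(bytes, bytes + size);
  }

  // A local argument is stored as its starting offset into the one
  // shared-memory allocation; the total then grows by `size`.
  // Assumption: no alignment padding between local arguments.
  void add_local_arg(std::size_t index, std::size_t size) {
    const std::uint32_t offset = static_cast<std::uint32_t>(local_bytes);
    add_arg(index, sizeof(offset), &offset);
    local_bytes += size;
  }

  // Argument pointers in the array-of-void* form a driver launch call takes.
  std::vector<void *> pointers() {
    std::vector<void *> out;
    for (auto &slot : storage)
      out.push_back(slot.data());
    return out;
  }

  // Read back the offset stored for a local argument (illustration only).
  std::uint32_t offset_at(std::size_t index) const {
    std::uint32_t value = 0;
    std::memcpy(&value, storage[index].data(), sizeof(value));
    return value;
  }
};
```

At dispatch time, a backend built this way would hand `pointers()` and `local_bytes` to the native launch call, which is how the saved state reaches a function object that has none of its own.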

Definition at line 578 of file pi_cuda.hpp.

Member Typedef Documentation

◆ native_type [1/2]

using _pi_kernel::native_type = hipFunction_t

Definition at line 556 of file pi_hip.hpp.

◆ native_type [2/2]

using _pi_kernel::native_type = CUfunction

Definition at line 579 of file pi_cuda.hpp.

Constructor & Destructor Documentation

◆ _pi_kernel() [1/6]

_pi_kernel::_pi_kernel ( CUfunction  func,
CUfunction  funcWithOffsetParam,
const char *  name,
pi_program  program,
pi_context  ctxt 
)
inline

Note: this code assumes that there is only one device per context.

Definition at line 661 of file pi_cuda.hpp.

◆ _pi_kernel() [2/6]

_pi_kernel::_pi_kernel ( CUfunction  func,
const char *  name,
pi_program  program,
pi_context  ctxt 
)
inline

Note: this code assumes that there is only one device per context.

Definition at line 674 of file pi_cuda.hpp.

◆ ~_pi_kernel() [1/2]

_pi_kernel::~_pi_kernel ( )
inline

Definition at line 684 of file pi_cuda.hpp.

References context_, cuda_piContextRelease(), cuda_piProgramRelease(), and program_.

◆ _pi_kernel() [3/6]

_pi_kernel::_pi_kernel ( )
inline

Definition at line 160 of file pi_esimd_emulator.hpp.

◆ _pi_kernel() [4/6]

_pi_kernel::_pi_kernel ( hipFunction_t  func,
hipFunction_t  funcWithOffsetParam,
const char *  name,
pi_program  program,
pi_context  ctxt 
)
inline

Definition at line 635 of file pi_hip.hpp.

◆ _pi_kernel() [5/6]

_pi_kernel::_pi_kernel ( hipFunction_t  func,
const char *  name,
pi_program  program,
pi_context  ctxt 
)
inline

Definition at line 643 of file pi_hip.hpp.

◆ ~_pi_kernel() [2/2]

_pi_kernel::~_pi_kernel ( )
inline

Definition at line 647 of file pi_hip.hpp.

References context_, hip_piContextRelease(), hip_piProgramRelease(), and program_.

◆ _pi_kernel() [6/6]

_pi_kernel::_pi_kernel ( ze_kernel_handle_t  Kernel,
bool  OwnZeKernel,
pi_program  Program 
)
inline

Definition at line 1141 of file pi_level_zero.hpp.

Member Function Documentation

◆ clear_local_size() [1/2]

void _pi_kernel::clear_local_size ( )
inline

Definition at line 698 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::clear_local_size().

◆ clear_local_size() [2/2]

void _pi_kernel::clear_local_size ( )
inline

Definition at line 736 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::clear_local_size().

◆ decrement_reference_count() [1/2]

pi_uint32 _pi_kernel::decrement_reference_count ( )
inline noexcept

Definition at line 656 of file pi_hip.hpp.

References refCount_.

◆ decrement_reference_count() [2/2]

pi_uint32 _pi_kernel::decrement_reference_count ( )
inline noexcept

Definition at line 694 of file pi_cuda.hpp.

References refCount_.

◆ get() [1/2]

native_type _pi_kernel::get ( ) const
inline noexcept

Definition at line 660 of file pi_hip.hpp.

References function_.

◆ get() [2/2]

native_type _pi_kernel::get ( ) const
inline noexcept

Definition at line 698 of file pi_cuda.hpp.

References function_.

◆ get_arg_indices() [1/2]

arguments::args_index_t _pi_kernel::get_arg_indices ( ) const
inline

Definition at line 692 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::get_indices().

◆ get_arg_indices() [2/2]

const arguments::args_index_t& _pi_kernel::get_arg_indices ( ) const
inline

Definition at line 730 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::get_indices().

◆ get_context() [1/2]

pi_context _pi_kernel::get_context ( ) const
inline noexcept

Definition at line 670 of file pi_hip.hpp.

References context_.

◆ get_context() [2/2]

pi_context _pi_kernel::get_context ( ) const
inline noexcept

Definition at line 708 of file pi_cuda.hpp.

References context_.

◆ get_local_size() [1/2]

pi_uint32 _pi_kernel::get_local_size ( ) const
inline noexcept

Definition at line 696 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::get_local_size().

◆ get_local_size() [2/2]

pi_uint32 _pi_kernel::get_local_size ( ) const
inline noexcept

Definition at line 734 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::get_local_size().

◆ get_name() [1/2]

const char* _pi_kernel::get_name ( ) const
inline noexcept

Definition at line 672 of file pi_hip.hpp.

References name_.

◆ get_name() [2/2]

const char* _pi_kernel::get_name ( ) const
inline noexcept

Definition at line 710 of file pi_cuda.hpp.

References name_.

◆ get_num_args() [1/2]

pi_uint32 _pi_kernel::get_num_args ( ) const
inline noexcept

Returns the number of arguments, excluding the implicit global offset.

Note that this returns only the number of arguments currently known, not the number the kernel actually requires, because the latter cannot be queried through the HIP Driver API.

Definition at line 678 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::indices_.

◆ get_num_args() [2/2]

pi_uint32 _pi_kernel::get_num_args ( ) const
inline noexcept

Returns the number of arguments, excluding the implicit global offset.

Note that this returns only the number of arguments currently known, not the number the kernel actually requires, because the latter cannot be queried through the CUDA Driver API.

Definition at line 716 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::indices_.
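
To illustrate what "currently known" means here, the sketch below assumes that the index list reserves one trailing slot for the implicit global offset, so the reported count is the list size minus one. `num_args_sketch` and `set_arg()` are hypothetical names, and the trailing-slot layout is an assumption for illustration rather than something this page states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: only the counting logic is modelled.
struct num_args_sketch {
  // Assumption: the last slot is reserved for the implicit global offset.
  std::vector<void *> indices_{nullptr};

  // Record a pointer for argument `index`, keeping one reserved trailing slot.
  void set_arg(std::size_t index, void *ptr) {
    if (index + 2 > indices_.size())
      indices_.resize(index + 2, nullptr);
    indices_[index] = ptr;
  }

  // Number of arguments seen so far, excluding the reserved slot.
  std::uint32_t get_num_args() const noexcept {
    return static_cast<std::uint32_t>(indices_.size() - 1);
  }
};
```

In this sketch the count follows the highest index set so far: setting argument 2 before argument 1 already reports three arguments, and a kernel whose arguments have not all been set yet reports fewer than it needs.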

◆ get_program() [1/2]

pi_program _pi_kernel::get_program ( ) const
inline noexcept

Definition at line 652 of file pi_hip.hpp.

References program_.

◆ get_program() [2/2]

pi_program _pi_kernel::get_program ( ) const
inline noexcept

Definition at line 690 of file pi_cuda.hpp.

References program_.

◆ get_reference_count() [1/2]

pi_uint32 _pi_kernel::get_reference_count ( ) const
inline noexcept

Definition at line 658 of file pi_hip.hpp.

References refCount_.

◆ get_reference_count() [2/2]

pi_uint32 _pi_kernel::get_reference_count ( ) const
inline noexcept

Definition at line 696 of file pi_cuda.hpp.

References refCount_.

◆ get_with_offset_parameter() [1/2]

native_type _pi_kernel::get_with_offset_parameter ( ) const
inline noexcept

Definition at line 662 of file pi_hip.hpp.

References functionWithOffsetParam_.

◆ get_with_offset_parameter() [2/2]

native_type _pi_kernel::get_with_offset_parameter ( ) const
inline noexcept

Definition at line 700 of file pi_cuda.hpp.

References functionWithOffsetParam_.

◆ has_with_offset_parameter() [1/2]

bool _pi_kernel::has_with_offset_parameter ( ) const
inline noexcept

Definition at line 666 of file pi_hip.hpp.

References functionWithOffsetParam_.

◆ has_with_offset_parameter() [2/2]

bool _pi_kernel::has_with_offset_parameter ( ) const
inline noexcept

Definition at line 704 of file pi_cuda.hpp.

References functionWithOffsetParam_.
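
One plausible reading of the two function handles is sketched below: the offset-taking variant is optional, and a caller falls back to the plain function when it is absent or when no offset is in use. This is an assumption for illustration only. `offset_variant_sketch`, `pick()` and the `void *` stand-in for the native handle type are hypothetical and do not describe the actual launch path.

```cpp
#include <cassert>

// Stand-in for CUfunction / hipFunction_t (both are opaque handle types).
using native_type_sketch = void *;

struct offset_variant_sketch {
  native_type_sketch function_ = nullptr;
  native_type_sketch functionWithOffsetParam_ = nullptr;

  // Assumption: a null handle means no offset-taking variant was provided.
  bool has_with_offset_parameter() const noexcept {
    return functionWithOffsetParam_ != nullptr;
  }

  // Hypothetical helper: use the offset variant only when it exists and a
  // non-zero global offset was requested.
  native_type_sketch pick(bool offset_is_nonzero) const noexcept {
    return (offset_is_nonzero && has_with_offset_parameter())
               ? functionWithOffsetParam_
               : function_;
  }
};
```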

◆ hasIndirectAccess()

bool _pi_kernel::hasIndirectAccess ( )
inline

Definition at line 1146 of file pi_level_zero.hpp.

◆ increment_reference_count() [1/2]

pi_uint32 _pi_kernel::increment_reference_count ( )
inline noexcept

Definition at line 654 of file pi_hip.hpp.

References refCount_.

◆ increment_reference_count() [2/2]

pi_uint32 _pi_kernel::increment_reference_count ( )
inline noexcept

Definition at line 692 of file pi_cuda.hpp.

References refCount_.
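
The three reference-count members can be pictured as thin wrappers around an atomic counter. The sketch below is a self-contained approximation, not the backend code: `refcounted_sketch` and `release_sketch()` are hypothetical, and both the initial count of 1 and the delete-on-zero policy are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical object carrying only the reference-count members.
struct refcounted_sketch {
  // Assumption: the creator holds the first reference.
  std::atomic<std::uint32_t> refCount_{1};

  // Each call returns the count after the update.
  std::uint32_t increment_reference_count() noexcept { return ++refCount_; }
  std::uint32_t decrement_reference_count() noexcept { return --refCount_; }
  std::uint32_t get_reference_count() const noexcept { return refCount_.load(); }
};

// Hypothetical release helper: destroy the object when the last reference
// is dropped. Returns true if the object was deleted.
inline bool release_sketch(refcounted_sketch *kernel) {
  if (kernel->decrement_reference_count() == 0) {
    delete kernel;
    return true;
  }
  return false;
}
```

Returning the updated value from the atomic operation itself lets the caller decide on destruction without a separate read, which would otherwise race with other threads releasing the same object.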

◆ set_implicit_offset_arg() [1/2]

void _pi_kernel::set_implicit_offset_arg ( size_t  size,
std::uint32_t *  implicitOffset 
)
inline

Definition at line 688 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::set_implicit_offset().

◆ set_implicit_offset_arg() [2/2]

void _pi_kernel::set_implicit_offset_arg ( size_t  size,
std::uint32_t *  implicitOffset 
)
inline

Definition at line 726 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::set_implicit_offset().

◆ set_kernel_arg() [1/2]

void _pi_kernel::set_kernel_arg ( int  index,
size_t  size,
const void *  arg 
)
inline

Definition at line 680 of file pi_hip.hpp.

References _pi_kernel::arguments::add_arg(), and args_.

◆ set_kernel_arg() [2/2]

void _pi_kernel::set_kernel_arg ( int  index,
size_t  size,
const void *  arg 
)
inline

Definition at line 718 of file pi_cuda.hpp.

References _pi_kernel::arguments::add_arg(), and args_.

◆ set_kernel_local_arg() [1/2]

void _pi_kernel::set_kernel_local_arg ( int  index,
size_t  size 
)
inline

Definition at line 684 of file pi_hip.hpp.

References _pi_kernel::arguments::add_local_arg(), and args_.

◆ set_kernel_local_arg() [2/2]

void _pi_kernel::set_kernel_local_arg ( int  index,
size_t  size 
)
inline

Definition at line 722 of file pi_cuda.hpp.

References _pi_kernel::arguments::add_local_arg(), and args_.

Member Data Documentation

◆ args_

◆ context_

pi_context _pi_kernel::context_

Definition at line 584 of file pi_cuda.hpp.

Referenced by get_context(), and ~_pi_kernel().

◆ function_

native_type _pi_kernel::function_

Definition at line 581 of file pi_cuda.hpp.

Referenced by get().

◆ functionWithOffsetParam_

native_type _pi_kernel::functionWithOffsetParam_

Definition at line 582 of file pi_cuda.hpp.

Referenced by get_with_offset_parameter(), and has_with_offset_parameter().

◆ MemAllocs

std::unordered_set<std::pair<void *const, MemAllocRecord> *, Hash> _pi_kernel::MemAllocs

Definition at line 1184 of file pi_level_zero.hpp.

Referenced by piKernelRelease().

◆ name_

std::string _pi_kernel::name_

Definition at line 583 of file pi_cuda.hpp.

Referenced by get_name().

◆ OwnZeKernel

bool _pi_kernel::OwnZeKernel

Definition at line 1157 of file pi_level_zero.hpp.

Referenced by piKernelRelease().

◆ Program

pi_program _pi_kernel::Program

Definition at line 1160 of file pi_level_zero.hpp.

Referenced by piKernelGetInfo(), piKernelRelease(), and piKernelRetain().

◆ program_

pi_program _pi_kernel::program_

Definition at line 585 of file pi_cuda.hpp.

Referenced by get_program(), and ~_pi_kernel().

◆ refCount_

std::atomic_uint32_t _pi_kernel::refCount_

◆ REQD_THREADS_PER_BLOCK_DIMENSIONS

constexpr pi_uint32 _pi_kernel::REQD_THREADS_PER_BLOCK_DIMENSIONS = 3u
static constexpr

Definition at line 588 of file pi_cuda.hpp.

◆ reqdThreadsPerBlock_

size_t _pi_kernel::reqdThreadsPerBlock_[REQD_THREADS_PER_BLOCK_DIMENSIONS]

Definition at line 589 of file pi_cuda.hpp.

◆ SubmissionsCount

std::atomic<pi_uint32> _pi_kernel::SubmissionsCount

Definition at line 1195 of file pi_level_zero.hpp.

Referenced by piKernelRelease().

◆ ZeKernel

◆ ZeKernelProperties

ZeCache<ZeStruct<ze_kernel_properties_t> > _pi_kernel::ZeKernelProperties

The documentation for this struct was generated from the following files: