DPC++ Runtime
Runtime libraries for oneAPI DPC++
_pi_kernel Struct Reference

Implementation of a PI Kernel for CUDA.

#include <cuda/pi_cuda.hpp>

Classes

struct  arguments
 Structure that holds the arguments to the kernel.
 

Public Types

using native_type = CUfunction
 
using native_type = hipFunction_t
 

Public Member Functions

 _pi_kernel (CUfunction func, CUfunction funcWithOffsetParam, const char *name, pi_program program, pi_context ctxt)
 
 ~_pi_kernel ()
 
pi_program get_program () const noexcept
 
pi_uint32 increment_reference_count () noexcept
 
pi_uint32 decrement_reference_count () noexcept
 
pi_uint32 get_reference_count () const noexcept
 
native_type get () const noexcept
 
native_type get_with_offset_parameter () const noexcept
 
bool has_with_offset_parameter () const noexcept
 
pi_context get_context () const noexcept
 
const char * get_name () const noexcept
 
pi_uint32 get_num_args () const noexcept
 Returns the number of arguments, excluding the implicit global offset.
 
void set_kernel_arg (int index, size_t size, const void *arg)
 
void set_kernel_local_arg (int index, size_t size)
 
void set_implicit_offset_arg (size_t size, std::uint32_t *implicitOffset)
 
const arguments::args_index_t & get_arg_indices () const
 
pi_uint32 get_local_size () const noexcept
 
void clear_local_size ()
 
 _pi_kernel ()
 
 _pi_kernel (hipFunction_t func, hipFunction_t funcWithOffsetParam, const char *name, pi_program program, pi_context ctxt)
 
 _pi_kernel (hipFunction_t func, const char *name, pi_program program, pi_context ctxt)
 
 ~_pi_kernel ()
 
pi_program get_program () const noexcept
 
pi_uint32 increment_reference_count () noexcept
 
pi_uint32 decrement_reference_count () noexcept
 
pi_uint32 get_reference_count () const noexcept
 
native_type get () const noexcept
 
native_type get_with_offset_parameter () const noexcept
 
bool has_with_offset_parameter () const noexcept
 
pi_context get_context () const noexcept
 
const char * get_name () const noexcept
 
pi_uint32 get_num_args () const noexcept
 Returns the number of arguments, excluding the implicit global offset.
 
void set_kernel_arg (int index, size_t size, const void *arg)
 
void set_kernel_local_arg (int index, size_t size)
 
void set_implicit_offset_arg (size_t size, std::uint32_t *implicitOffset)
 
arguments::args_index_t get_arg_indices () const
 
pi_uint32 get_local_size () const noexcept
 
void clear_local_size ()
 

Public Attributes

native_type function_
 
native_type functionWithOffsetParam_
 
std::string name_
 
pi_context context_
 
pi_program program_
 
std::atomic_uint32_t refCount_
 
size_t reqdThreadsPerBlock_ [REQD_THREADS_PER_BLOCK_DIMENSIONS]
 
struct _pi_kernel::arguments args_
 

Static Public Attributes

static constexpr pi_uint32 REQD_THREADS_PER_BLOCK_DIMENSIONS = 3u
 

Detailed Description

Implementation of a PI Kernel for CUDA.

Implementation of a PI Kernel for HIP.

A PI Kernel accumulates its arguments through set calls, so the kernel object carries state for a given invocation. A CUfunction object holds no such state: its arguments are simply passed along with it at launch. To emulate the PI Kernel interface, the CUDA implementation stores the list of arguments, their sizes and their offsets, and saves them for the later dispatch. Note that the PI API specifies local memory as a size per individual argument, whereas CUDA requires only the total shared memory usage, since shared memory is not passed as a parameter. A compiler pass converts the PI API local memory model into the CUDA shared memory model. This object simply calculates the total shared memory and the initial offset of each parameter.

A PI Kernel accumulates its arguments through set calls, so the kernel object carries state for a given invocation. A hipFunction_t object holds no such state: its arguments are simply passed along with it at launch. To emulate the PI Kernel interface, the HIP implementation stores the list of arguments, their sizes and their offsets, and saves them for the later dispatch. Note that the PI API specifies local memory as a size per individual argument, whereas HIP requires only the total shared memory usage, since shared memory is not passed as a parameter. A compiler pass converts the PI API local memory model into the HIP shared memory model. This object simply calculates the total shared memory and the initial offset of each parameter.

Definition at line 817 of file pi_cuda.hpp.
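
The bookkeeping described above can be illustrated with a minimal sketch. It is not the code in pi_cuda.hpp or pi_hip.hpp: the type ArgStoreSketch and its members values and localBytes are hypothetical names chosen for illustration, and alignment of local arguments is deliberately ignored. It only shows the idea of saving by-value arguments for a later launch and turning per-argument local sizes into offsets within one shared-memory total.

```cpp
#include <cstddef>
#include <cstring>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the nested `arguments` structure (assumption:
// the real structure differs in layout and handles alignment).
struct ArgStoreSketch {
  std::vector<std::vector<char>> values; // one byte copy per argument index
  std::vector<std::size_t> localBytes;   // shared-memory bytes claimed per index

  // Save a by-value argument so it can be handed to the driver at launch.
  void add_arg(std::size_t index, std::size_t size, const void *arg,
               std::size_t localSize = 0) {
    if (index >= values.size()) {
      values.resize(index + 1);
      localBytes.resize(index + 1, 0);
    }
    const char *bytes = static_cast<const char *>(arg);
    values[index].assign(bytes, bytes + size);
    localBytes[index] = localSize;
  }

  // A local argument carries no data. The kernel instead receives the byte
  // offset of its slice inside the single shared-memory allocation.
  void add_local_arg(std::size_t index, std::size_t size) {
    if (index < localBytes.size())
      localBytes[index] = 0; // drop any earlier claim made for this index
    const std::size_t offset = get_local_size();
    add_arg(index, sizeof(offset), &offset, size);
  }

  // Total shared memory to request at launch.
  std::size_t get_local_size() const {
    return std::accumulate(localBytes.begin(), localBytes.end(),
                           std::size_t{0});
  }

  // Array of pointers to the saved values, one per argument index.
  std::vector<void *> get_indices() {
    std::vector<void *> out;
    for (auto &v : values)
      out.push_back(v.data());
    return out;
  }
};
```

In this sketch, setting two local arguments of 64 and 128 bytes in turn stores the offsets 0 and 64 as their argument values, and get_local_size() reports the 192 bytes that a launch would request as shared memory.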

Member Typedef Documentation

◆ native_type [1/2]

using _pi_kernel::native_type = hipFunction_t

Definition at line 777 of file pi_hip.hpp.

◆ native_type [2/2]

using _pi_kernel::native_type = CUfunction

Definition at line 818 of file pi_cuda.hpp.

Constructor & Destructor Documentation

◆ _pi_kernel() [1/4]

_pi_kernel::_pi_kernel ( CUfunction  func,
CUfunction  funcWithOffsetParam,
const char *  name,
pi_program  program,
pi_context  ctxt 
)
inline

Note: this code assumes that there is only one device per context.

Definition at line 915 of file pi_cuda.hpp.

◆ ~_pi_kernel() [1/2]

_pi_kernel::~_pi_kernel ( )
inline

Definition at line 929 of file pi_cuda.hpp.

References context_, cuda_piContextRelease(), cuda_piProgramRelease(), and program_.

◆ _pi_kernel() [2/4]

_pi_kernel::_pi_kernel ( )
inline

Definition at line 218 of file pi_esimd_emulator.hpp.

◆ _pi_kernel() [3/4]

_pi_kernel::_pi_kernel ( hipFunction_t  func,
hipFunction_t  funcWithOffsetParam,
const char *  name,
pi_program  program,
pi_context  ctxt 
)
inline

Definition at line 871 of file pi_hip.hpp.

◆ _pi_kernel() [4/4]

_pi_kernel::_pi_kernel ( hipFunction_t  func,
const char *  name,
pi_program  program,
pi_context  ctxt 
)
inline

Definition at line 879 of file pi_hip.hpp.

◆ ~_pi_kernel() [2/2]

_pi_kernel::~_pi_kernel ( )
inline

Definition at line 883 of file pi_hip.hpp.

References context_, hip_piContextRelease(), hip_piProgramRelease(), and program_.

Member Function Documentation

◆ clear_local_size() [1/2]

void _pi_kernel::clear_local_size ( )
inline

Definition at line 934 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::clear_local_size().

◆ clear_local_size() [2/2]

void _pi_kernel::clear_local_size ( )
inline

Definition at line 980 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::clear_local_size().

◆ decrement_reference_count() [1/2]

pi_uint32 _pi_kernel::decrement_reference_count ( )
inline noexcept

Definition at line 892 of file pi_hip.hpp.

References refCount_.

◆ decrement_reference_count() [2/2]

pi_uint32 _pi_kernel::decrement_reference_count ( )
inline noexcept

Definition at line 938 of file pi_cuda.hpp.

References refCount_.

◆ get() [1/2]

native_type _pi_kernel::get ( ) const
inline noexcept

Definition at line 896 of file pi_hip.hpp.

References function_.

◆ get() [2/2]

native_type _pi_kernel::get ( ) const
inline noexcept

Definition at line 942 of file pi_cuda.hpp.

References function_.

◆ get_arg_indices() [1/2]

arguments::args_index_t _pi_kernel::get_arg_indices ( ) const
inline

Definition at line 928 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::get_indices().

◆ get_arg_indices() [2/2]

const arguments::args_index_t& _pi_kernel::get_arg_indices ( ) const
inline

Definition at line 974 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::get_indices().

◆ get_context() [1/2]

pi_context _pi_kernel::get_context ( ) const
inline noexcept

Definition at line 906 of file pi_hip.hpp.

References context_.

◆ get_context() [2/2]

pi_context _pi_kernel::get_context ( ) const
inline noexcept

Definition at line 952 of file pi_cuda.hpp.

References context_.

◆ get_local_size() [1/2]

pi_uint32 _pi_kernel::get_local_size ( ) const
inline noexcept

Definition at line 932 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::get_local_size().

◆ get_local_size() [2/2]

pi_uint32 _pi_kernel::get_local_size ( ) const
inline noexcept

Definition at line 978 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::get_local_size().

◆ get_name() [1/2]

const char* _pi_kernel::get_name ( ) const
inline noexcept

Definition at line 908 of file pi_hip.hpp.

References name_.

◆ get_name() [2/2]

const char* _pi_kernel::get_name ( ) const
inline noexcept

Definition at line 954 of file pi_cuda.hpp.

References name_.

◆ get_num_args() [1/2]

pi_uint32 _pi_kernel::get_num_args ( ) const
inline noexcept

Returns the number of arguments, excluding the implicit global offset.

Note that this returns only the number of arguments currently known, not the number the kernel actually requires, since that cannot be queried from the HIP Driver API.

Definition at line 914 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::indices_.

◆ get_num_args() [2/2]

pi_uint32 _pi_kernel::get_num_args ( ) const
inline noexcept

Returns the number of arguments, excluding the implicit global offset.

Note that this returns only the number of arguments currently known, not the number the kernel actually requires, since that cannot be queried from the CUDA Driver API.

Definition at line 960 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::indices_.
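
One way to read "excluding the implicit global offset" is sketched below. The sketch assumes that the index list always keeps one trailing slot for the implicit offset, so the reported count is the list size minus one. That layout, the type IndicesSketch and the helper note_arg are assumptions made for illustration and are not taken from this page.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the index list (assumption: the last slot is
// reserved for the implicit global offset).
struct IndicesSketch {
  std::uint32_t implicitOffset_[3] = {0, 0, 0};
  std::vector<void *> indices_;

  IndicesSketch() { indices_.push_back(implicitOffset_); }

  // The list stores a pointer into this object, so copying is disabled.
  IndicesSketch(const IndicesSketch &) = delete;
  IndicesSketch &operator=(const IndicesSketch &) = delete;

  // Record where the value for argument `index` lives.
  void note_arg(std::size_t index, void *value) {
    if (index + 2 > indices_.size()) {
      void *offsetSlot = indices_.back();     // copy before resizing
      indices_.resize(index + 2, offsetSlot); // offset stays in last place
    }
    indices_[index] = value;
  }

  // Number of arguments seen so far, not counting the implicit offset.
  std::uint32_t get_num_args() const noexcept {
    return static_cast<std::uint32_t>(indices_.size() - 1);
  }
};
```

Under this assumption, setting only argument 2 makes get_num_args() report 3. This matches the note above: the count reflects the highest index set so far rather than the kernel's true parameter count.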

◆ get_program() [1/2]

pi_program _pi_kernel::get_program ( ) const
inline noexcept

Definition at line 888 of file pi_hip.hpp.

References program_.

◆ get_program() [2/2]

pi_program _pi_kernel::get_program ( ) const
inline noexcept

Definition at line 934 of file pi_cuda.hpp.

References program_.

◆ get_reference_count() [1/2]

pi_uint32 _pi_kernel::get_reference_count ( ) const
inline noexcept

Definition at line 894 of file pi_hip.hpp.

References refCount_.

◆ get_reference_count() [2/2]

pi_uint32 _pi_kernel::get_reference_count ( ) const
inline noexcept

Definition at line 940 of file pi_cuda.hpp.

References refCount_.

◆ get_with_offset_parameter() [1/2]

native_type _pi_kernel::get_with_offset_parameter ( ) const
inline noexcept

Definition at line 898 of file pi_hip.hpp.

References functionWithOffsetParam_.

◆ get_with_offset_parameter() [2/2]

native_type _pi_kernel::get_with_offset_parameter ( ) const
inline noexcept

Definition at line 944 of file pi_cuda.hpp.

References functionWithOffsetParam_.

◆ has_with_offset_parameter() [1/2]

bool _pi_kernel::has_with_offset_parameter ( ) const
inline noexcept

Definition at line 902 of file pi_hip.hpp.

References functionWithOffsetParam_.

◆ has_with_offset_parameter() [2/2]

bool _pi_kernel::has_with_offset_parameter ( ) const
inline noexcept

Definition at line 948 of file pi_cuda.hpp.

References functionWithOffsetParam_.
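
The kernel keeps two native handles, function_ and functionWithOffsetParam_. A plausible use of has_with_offset_parameter() is to choose between them at launch time, as sketched below. The selection rule (use the offset variant only when some global offset component is nonzero), the type KernelHandlesSketch and the helper select_entry_point are assumptions for illustration. A const void * stands in for CUfunction or hipFunction_t so that the sketch compiles without either SDK.

```cpp
#include <cstddef>

// Hypothetical stand-in for the two handles held by the kernel object.
struct KernelHandlesSketch {
  using native_type = const void *; // stand-in for CUfunction / hipFunction_t
  native_type function_ = nullptr;
  native_type functionWithOffsetParam_ = nullptr;

  native_type get() const noexcept { return function_; }
  native_type get_with_offset_parameter() const noexcept {
    return functionWithOffsetParam_;
  }
  bool has_with_offset_parameter() const noexcept {
    return functionWithOffsetParam_ != nullptr;
  }
};

// Hypothetical helper: pick the entry point for one launch.
inline KernelHandlesSketch::native_type
select_entry_point(const KernelHandlesSketch &k,
                   const std::size_t *globalOffset, std::size_t dims) {
  if (globalOffset == nullptr || !k.has_with_offset_parameter())
    return k.get();
  for (std::size_t i = 0; i < dims; ++i) {
    if (globalOffset[i] != 0)
      return k.get_with_offset_parameter(); // a nonzero offset was requested
  }
  return k.get(); // all offsets are zero, so the plain variant suffices
}
```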

◆ increment_reference_count() [1/2]

pi_uint32 _pi_kernel::increment_reference_count ( )
inline noexcept

Definition at line 890 of file pi_hip.hpp.

References refCount_.

◆ increment_reference_count() [2/2]

pi_uint32 _pi_kernel::increment_reference_count ( )
inline noexcept

Definition at line 936 of file pi_cuda.hpp.

References refCount_.
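
The three reference-count accessors all operate on the atomic member refCount_. The following is a minimal sketch of how such a counter is commonly used, assuming the object starts with a count of 1, that increment and decrement return the updated value, and that the holder deletes the object when the count reaches zero. The type RefCountedSketch and the helper release_sketch are hypothetical and do not appear in the plugin.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical reference-counted object mirroring the accessor names above.
struct RefCountedSketch {
  std::atomic_uint32_t refCount_{1}; // assumption: creation holds one reference

  std::uint32_t increment_reference_count() noexcept { return ++refCount_; }
  std::uint32_t decrement_reference_count() noexcept { return --refCount_; }
  std::uint32_t get_reference_count() const noexcept { return refCount_; }
};

// Hypothetical release helper: drops one reference and destroys the object
// when no references remain. Returns true if the object was destroyed.
inline bool release_sketch(RefCountedSketch *obj) {
  if (obj->decrement_reference_count() == 0) {
    delete obj;
    return true;
  }
  return false;
}
```

Because the pre-increment and pre-decrement on a std::atomic are single atomic read-modify-write operations, only one caller can observe the transition to zero, which is what makes the delete in the helper safe in this sketch.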

◆ set_implicit_offset_arg() [1/2]

void _pi_kernel::set_implicit_offset_arg ( size_t  size,
std::uint32_t *  implicitOffset 
)
inline

Definition at line 924 of file pi_hip.hpp.

References args_, and _pi_kernel::arguments::set_implicit_offset().

◆ set_implicit_offset_arg() [2/2]

void _pi_kernel::set_implicit_offset_arg ( size_t  size,
std::uint32_t *  implicitOffset 
)
inline

Definition at line 970 of file pi_cuda.hpp.

References args_, and _pi_kernel::arguments::set_implicit_offset().

◆ set_kernel_arg() [1/2]

void _pi_kernel::set_kernel_arg ( int  index,
size_t  size,
const void *  arg 
)
inline

Definition at line 916 of file pi_hip.hpp.

References _pi_kernel::arguments::add_arg(), and args_.

◆ set_kernel_arg() [2/2]

void _pi_kernel::set_kernel_arg ( int  index,
size_t  size,
const void *  arg 
)
inline

Definition at line 962 of file pi_cuda.hpp.

References _pi_kernel::arguments::add_arg(), and args_.

◆ set_kernel_local_arg() [1/2]

void _pi_kernel::set_kernel_local_arg ( int  index,
size_t  size 
)
inline

Definition at line 920 of file pi_hip.hpp.

References _pi_kernel::arguments::add_local_arg(), and args_.

◆ set_kernel_local_arg() [2/2]

void _pi_kernel::set_kernel_local_arg ( int  index,
size_t  size 
)
inline

Definition at line 966 of file pi_cuda.hpp.

References _pi_kernel::arguments::add_local_arg(), and args_.

Member Data Documentation

◆ args_

◆ context_

pi_context _pi_kernel::context_

Definition at line 823 of file pi_cuda.hpp.

Referenced by get_context(), and ~_pi_kernel().

◆ function_

native_type _pi_kernel::function_

Definition at line 820 of file pi_cuda.hpp.

Referenced by get().

◆ functionWithOffsetParam_

native_type _pi_kernel::functionWithOffsetParam_

Definition at line 821 of file pi_cuda.hpp.

Referenced by get_with_offset_parameter(), and has_with_offset_parameter().

◆ name_

std::string _pi_kernel::name_

Definition at line 822 of file pi_cuda.hpp.

Referenced by get_name().

◆ program_

pi_program _pi_kernel::program_

Definition at line 824 of file pi_cuda.hpp.

Referenced by get_program(), and ~_pi_kernel().

◆ refCount_

std::atomic_uint32_t _pi_kernel::refCount_

◆ REQD_THREADS_PER_BLOCK_DIMENSIONS

constexpr pi_uint32 _pi_kernel::REQD_THREADS_PER_BLOCK_DIMENSIONS = 3u
static constexpr

Definition at line 827 of file pi_cuda.hpp.

◆ reqdThreadsPerBlock_

size_t _pi_kernel::reqdThreadsPerBlock_[REQD_THREADS_PER_BLOCK_DIMENSIONS]

Definition at line 828 of file pi_cuda.hpp.


The documentation for this struct was generated from the following files: