DPC++ Runtime
Runtime libraries for oneAPI DPC++
syclcompat::experimental Namespace Reference

Classes

class  logical_group
 A logical collection of work-items within a work-group.
 

Functions

template<typename T >
T select_from_sub_group (unsigned int member_mask, sycl::sub_group g, T x, int remote_local_id, int logical_sub_group_size=32)
 Masked version of select_from_sub_group, which executes a masked sub-group operation.
 
template<typename T >
T shift_sub_group_left (unsigned int member_mask, sycl::sub_group g, T x, unsigned int delta, int logical_sub_group_size=32)
 Masked version of shift_sub_group_left, which executes a masked sub-group operation.
 
template<typename T >
T shift_sub_group_right (unsigned int member_mask, sycl::sub_group g, T x, unsigned int delta, int logical_sub_group_size=32)
 Masked version of shift_sub_group_right, which executes a masked sub-group operation.
 
template<typename T >
T permute_sub_group_by_xor (unsigned int member_mask, sycl::sub_group g, T x, unsigned int mask, int logical_sub_group_size=32)
 Masked version of permute_sub_group_by_xor, which executes a masked sub-group operation.
 
template<int dimensions = 3>
void nd_range_barrier (sycl::nd_item< dimensions > item, sycl::atomic_ref< unsigned int, sycl::memory_order::acq_rel, sycl::memory_scope::device, sycl::access::address_space::global_space > &counter)
 Synchronizes the work-items of all work-groups within a SYCL kernel.
 
template<>
void nd_range_barrier (sycl::nd_item< 1 > item, sycl::atomic_ref< unsigned int, sycl::memory_order::acq_rel, sycl::memory_scope::device, sycl::access::address_space::global_space > &counter)
 Synchronizes the work-items of all work-groups within a SYCL kernel.
 
int calculate_max_active_wg_per_xecore (int *num_wg, int wg_size, int slm_size=0, int sg_size=32, bool used_barrier=false, bool used_large_grf=false)
 Occupancy calculation: computes the maximum number of active work-groups per Xe-Core.
 
int calculate_max_potential_wg (int *num_wg, int *wg_size, int max_wg_size_for_device_code, int slm_size=0, int sg_size=32, bool used_barrier=false, bool used_large_grf=false)
 Occupancy calculation: computes the number of work-groups and the work-group size that can potentially achieve the maximum occupancy of the device.
 

Function Documentation

◆ calculate_max_active_wg_per_xecore()

int syclcompat::experimental::calculate_max_active_wg_per_xecore ( int *  num_wg,
int  wg_size,
int  slm_size = 0,
int  sg_size = 32,
bool  used_barrier = false,
bool  used_large_grf = false 
)
inline

Occupancy calculation: computes the maximum number of active work-groups per Xe-Core.

See https://github.com/oneapi-src/oneAPI-samples/tree/master/Tools/GPU-Occupancy-Calculator

Parameters
[out] num_wg  Number of active work-groups.
[in] wg_size  Work-group size.
[in] slm_size  Shared local memory size.
[in] sg_size  Sub-group size.
[in] used_barrier  Whether a barrier is used.
[in] used_large_grf  Whether the large General Register File is used.
Returns
Returns 0 on success. If wg_size exceeds the maximum work-group size, the maximum work-group size is used in place of wg_size and -1 is returned.

Definition at line 655 of file util.hpp.

References syclcompat::get_current_device().

Referenced by calculate_max_potential_wg().
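
The calling convention described above (result through an out-parameter, status through the return value) can be sketched with a host-only stand-in. The names `mock_max_active_wg`, `query_occupancy`, `kHypotheticalMaxWgSize` and `kHypotheticalLanesPerCore` are hypothetical, and the division is only a placeholder, not the library's occupancy model. Only the clamp-and-return -1 behavior is taken from the description.

```cpp
#include <cassert>

// Hypothetical device limits, chosen only for illustration.
constexpr int kHypotheticalMaxWgSize = 1024;
constexpr int kHypotheticalLanesPerCore = 4096;

// Stand-in that mirrors the documented contract of
// calculate_max_active_wg_per_xecore(): the result is written to *num_wg,
// 0 is returned on success, and -1 is returned when wg_size had to be
// replaced by the maximum work-group size.
// The division below is a placeholder, NOT the library's occupancy model.
inline int mock_max_active_wg(int *num_wg, int wg_size) {
  assert(num_wg != nullptr && wg_size > 0);
  int status = 0;
  if (wg_size > kHypotheticalMaxWgSize) {
    wg_size = kHypotheticalMaxWgSize;  // use the maximum instead
    status = -1;                       // report that clamping happened
  }
  *num_wg = kHypotheticalLanesPerCore / wg_size;
  return status;
}

// Caller-side pattern: read the out-parameter, but also remember whether
// the requested work-group size was honored.
struct OccupancyQuery {
  int num_wg;
  bool clamped;
};

inline OccupancyQuery query_occupancy(int wg_size) {
  int num_wg = 0;
  const int status = mock_max_active_wg(&num_wg, wg_size);
  return {num_wg, status != 0};
}
```

A caller that ignores the return value would silently receive a result computed for a smaller work-group size than the one it asked for, which is why the sketch keeps the `clamped` flag next to the count.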

◆ calculate_max_potential_wg()

int syclcompat::experimental::calculate_max_potential_wg ( int *  num_wg,
int *  wg_size,
int  max_wg_size_for_device_code,
int  slm_size = 0,
int  sg_size = 32,
bool  used_barrier = false,
bool  used_large_grf = false 
)
inline

Occupancy calculation: computes the number of work-groups and the work-group size that can potentially achieve the maximum occupancy of the device.

See https://github.com/oneapi-src/oneAPI-samples/tree/master/Tools/GPU-Occupancy-Calculator

Parameters
[out] num_wg  Number of work-groups.
[out] wg_size  Work-group size.
[in] max_wg_size_for_device_code  The maximum work-group size that the current device code logic can work with. Zero means no limit.
[in] slm_size  Shared local memory size.
[in] sg_size  Sub-group size.
[in] used_barrier  Whether a barrier is used.
[in] used_large_grf  Whether the large General Register File is used.
Returns
Returns 0.

Definition at line 720 of file util.hpp.

References calculate_max_active_wg_per_xecore(), syclcompat::get_current_device(), sycl::_V1::device::get_info(), and sycl::_V1::device::has().

◆ nd_range_barrier() [1/2]

template<>
void syclcompat::experimental::nd_range_barrier ( sycl::nd_item< 1 >  item,
sycl::atomic_ref< unsigned int, sycl::memory_order::acq_rel, sycl::memory_scope::device, sycl::access::address_space::global_space > &  counter 
)
inline

Synchronizes the work-items of all work-groups within a SYCL kernel.

Parameters
[in] item  The nd_item of the calling work-item, which identifies its work-group.
[in] counter  An atomic object in device memory that can be accessed by the work-items of all work-groups. Its initial value should be zero.
Note: Make sure that all work-items of all work-groups within the SYCL kernel can be scheduled actively on the device at the same time.

Definition at line 544 of file util.hpp.

References sycl::_V1::nd_item< Dimensions >::barrier(), sycl::_V1::nd_item< Dimensions >::get_group(), sycl::_V1::nd_item< Dimensions >::get_group_range(), and sycl::_V1::nd_item< Dimensions >::get_local_linear_id().

◆ nd_range_barrier() [2/2]

template<int dimensions = 3>
void syclcompat::experimental::nd_range_barrier ( sycl::nd_item< dimensions >  item,
sycl::atomic_ref< unsigned int, sycl::memory_order::acq_rel, sycl::memory_scope::device, sycl::access::address_space::global_space > &  counter 
)
inline

Synchronizes the work-items of all work-groups within a SYCL kernel.

Parameters
[in] item  The nd_item of the calling work-item, which identifies its work-group.
[in] counter  An atomic object in device memory that can be accessed by the work-items of all work-groups. Its initial value should be zero.
Note: Make sure that all work-items of all work-groups within the SYCL kernel can be scheduled actively on the device at the same time.

Definition at line 504 of file util.hpp.

References sycl::_V1::nd_item< Dimensions >::barrier(), sycl::_V1::nd_item< Dimensions >::get_group(), sycl::_V1::nd_item< Dimensions >::get_group_range(), and sycl::_V1::nd_item< Dimensions >::get_local_linear_id().
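
Why the counter should start at zero can be illustrated with a single-threaded host model of one common counter-based scheme, in which the top bit of the counter acts as a phase flag. This is an assumption about how such a barrier can be built, not a statement of what util.hpp does; `arrive`, `released`, `simulate_rounds` and `kPhaseBit` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kPhaseBit = 0x80000000u;

// One work-group leader announces its arrival. Group 0 adds a value chosen
// so that the increments of one round sum to kPhaseBit; every other group
// adds 1. Returns the counter value observed before the increment.
inline std::uint32_t arrive(std::atomic<std::uint32_t> &counter,
                            unsigned group, unsigned num_groups) {
  assert(num_groups >= 1 && group < num_groups);
  const std::uint32_t inc =
      (group == 0) ? kPhaseBit - (num_groups - 1) : 1u;
  return counter.fetch_add(inc);
}

// A leader may proceed once the phase bit differs from the one it saw.
inline bool released(std::uint32_t seen,
                     const std::atomic<std::uint32_t> &counter) {
  return ((seen ^ counter.load()) & kPhaseBit) != 0;
}

// Replays `rounds` barrier rounds with the leaders arriving in `order`
// (a permutation of 0..n-1). Returns true if, in every round, no leader is
// released before the last arrival and all are released after it.
inline bool simulate_rounds(const std::vector<unsigned> &order, int rounds,
                            std::uint32_t initial) {
  std::atomic<std::uint32_t> counter{initial};
  const unsigned n = static_cast<unsigned>(order.size());
  for (int r = 0; r < rounds; ++r) {
    std::vector<std::uint32_t> seen;
    for (unsigned i = 0; i < n; ++i) {
      seen.push_back(arrive(counter, order[i], n));
      const bool all_arrived = (i + 1 == n);
      for (std::uint32_t s : seen)
        if (released(s, counter) != all_arrived) return false;
    }
  }
  return true;
}
```

In this model the phase bit flips exactly once per round when the counter starts at zero, so the same counter can be reused for consecutive barriers. A starting value such as 0x7FFFFFFF lets the first arrival flip the bit and release a leader early. On a real device the leaders would spin until released, which is why all work-groups have to be resident at the same time.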

◆ permute_sub_group_by_xor()

template<typename T >
T syclcompat::experimental::permute_sub_group_by_xor ( unsigned int  member_mask,
sycl::sub_group  g,
T  x,
unsigned int  mask,
int  logical_sub_group_size = 32 
)

Masked version of permute_sub_group_by_xor, which executes a masked sub-group operation.

The member_mask parameter indicates which work-items participate in the call: bit n is set to 1 if and only if the work-item with id n participates. All work-items named in member_mask must execute the call with the same member_mask; otherwise the result is undefined.

Template Parameters
T  Input value type
Parameters
[in] member_mask  Mask of the work-items participating in the call
[in] g  Input sub_group
[in] x  Input value
[in] mask  XOR mask applied to the work-item id to select the source work-item
[in] logical_sub_group_size  Input logical sub_group size
Returns
The value of x obtained from the selected work-item

Definition at line 449 of file util.hpp.

References sycl::_V1::sub_group::get_local_linear_id(), and __spv::Scope::Subgroup.
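
The index arithmetic behind an XOR permutation can be sketched on the host with plain arrays, where element i stands for the value held by the work-item with sub-group local id i. The function names are hypothetical, and the rule that a work-item keeps its own value when its partner lies outside its logical sub-group is an assumption, not something this page states. member_mask is left out because the model only covers the case where every work-item participates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host model: work-item `id` reads the value of the work-item whose offset
// inside the same logical sub-group is (offset of id) XOR mask.
// Assumption: if that partner falls outside the logical sub-group, the
// work-item keeps its own value.
template <typename T>
std::vector<T> model_permute_by_xor(const std::vector<T> &values,
                                    unsigned int mask,
                                    unsigned int logical_size) {
  assert(logical_size > 0 && values.size() % logical_size == 0);
  std::vector<T> out(values.size());
  for (std::size_t id = 0; id < values.size(); ++id) {
    const std::size_t start = id / logical_size * logical_size;
    const std::size_t target = (id % logical_size) ^ mask;
    const std::size_t src = (target < logical_size) ? start + target : id;
    out[id] = values[src];
  }
  return out;
}

// Butterfly reduction built on the permutation: after log2(logical_size)
// exchange steps every work-item holds the sum over its logical sub-group.
// logical_size is expected to be a power of two.
inline std::vector<int> model_butterfly_sum(std::vector<int> values,
                                            unsigned int logical_size) {
  for (unsigned int m = logical_size / 2; m > 0; m /= 2) {
    const std::vector<int> partner =
        model_permute_by_xor(values, m, logical_size);
    for (std::size_t i = 0; i < values.size(); ++i) values[i] += partner[i];
  }
  return values;
}
```

With 16 values 0..15 and a logical sub-group size of 8, the two logical sub-groups reduce independently: every element of the first half ends at 28 and every element of the second half at 92.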

◆ select_from_sub_group()

template<typename T >
T syclcompat::experimental::select_from_sub_group ( unsigned int  member_mask,
sycl::sub_group  g,
T  x,
int  remote_local_id,
int  logical_sub_group_size = 32 
)

Masked version of select_from_sub_group, which executes a masked sub-group operation.

The member_mask parameter indicates which work-items participate in the call: bit n is set to 1 if and only if the work-item with id n participates. All work-items named in member_mask must execute the call with the same member_mask; otherwise the result is undefined.

Template Parameters
T  Input value type
Parameters
[in] member_mask  Mask of the work-items participating in the call
[in] g  Input sub_group
[in] x  Input value
[in] remote_local_id  Id of the source work-item
[in] logical_sub_group_size  Input logical sub_group size
Returns
The value of x obtained from the source work-item

Definition at line 310 of file util.hpp.

References sycl::_V1::sub_group::get_local_linear_id(), and __spv::Scope::Subgroup.
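
The bit layout of member_mask described above, and the way a source id can be resolved inside a logical sub-group, can be sketched on the host as follows. All three helper names are hypothetical. Taking remote_local_id modulo the logical sub-group size is an assumption about the wrap-around behavior, not something this page states.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Builds a member_mask in which bit n is set for every listed work-item id n.
inline unsigned int make_member_mask(std::initializer_list<unsigned int> ids) {
  unsigned int mask = 0;
  for (unsigned int n : ids) {
    assert(n < 32);  // a 32-bit mask can name ids 0..31
    mask |= 1u << n;
  }
  return mask;
}

// True if the work-item with id n is named in member_mask.
inline bool participates(unsigned int member_mask, unsigned int n) {
  return n < 32 && ((member_mask >> n) & 1u) != 0;
}

// Host model of the source lookup: work-item `id` reads from the work-item
// at offset remote_local_id inside its own logical sub-group.
// Assumption: remote_local_id wraps around modulo logical_size.
inline std::size_t model_select_source(std::size_t id, int remote_local_id,
                                       int logical_size) {
  assert(logical_size > 0 && remote_local_id >= 0);
  const std::size_t size = static_cast<std::size_t>(logical_size);
  const std::size_t start = id / size * size;
  return start + static_cast<std::size_t>(remote_local_id) % size;
}
```

For example, `make_member_mask({0, 1, 4})` yields 0x13. With a logical sub-group size of 8, work-item 13 asking for remote id 2 resolves to work-item 10 in this model.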

◆ shift_sub_group_left()

template<typename T >
T syclcompat::experimental::shift_sub_group_left ( unsigned int  member_mask,
sycl::sub_group  g,
T  x,
unsigned int  delta,
int  logical_sub_group_size = 32 
)

Masked version of shift_sub_group_left, which executes a masked sub-group operation.

The member_mask parameter indicates which work-items participate in the call: bit n is set to 1 if and only if the work-item with id n participates. All work-items named in member_mask must execute the call with the same member_mask; otherwise the result is undefined.

Template Parameters
T  Input value type
Parameters
[in] member_mask  Mask of the work-items participating in the call
[in] g  Input sub_group
[in] x  Input value
[in] delta  Shift distance, in work-items
[in] logical_sub_group_size  Input logical sub_group size
Returns
The value of x obtained from the source work-item

Definition at line 355 of file util.hpp.

References sycl::_V1::sub_group::get_local_linear_id(), and __spv::Scope::Subgroup.
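
A left shift can be modeled on the host with plain arrays, where element i stands for the value held by the work-item with sub-group local id i. The function names are hypothetical, and the rule that a work-item keeps its own value when id + delta falls past the end of its logical sub-group is an assumption, not something this page states. The model covers only the case where every work-item participates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host model: work-item `id` reads the value of work-item id + delta.
// Assumption: if id + delta lies beyond the end of the logical sub-group,
// the work-item keeps its own value.
template <typename T>
std::vector<T> model_shift_left(const std::vector<T> &values,
                                unsigned int delta,
                                unsigned int logical_size) {
  assert(logical_size > 0 && values.size() % logical_size == 0);
  std::vector<T> out(values.size());
  for (std::size_t id = 0; id < values.size(); ++id) {
    const std::size_t end = (id / logical_size + 1) * logical_size;
    out[id] = (id + delta < end) ? values[id + delta] : values[id];
  }
  return out;
}

// Tree reduction built on the shift: after the loop, the first work-item of
// each logical sub-group holds the sum over that sub-group. The other
// elements hold partial results and should not be used.
// logical_size is expected to be a power of two.
inline std::vector<int> model_reduce_to_first(std::vector<int> values,
                                              unsigned int logical_size) {
  for (unsigned int d = logical_size / 2; d > 0; d /= 2) {
    const std::vector<int> shifted = model_shift_left(values, d, logical_size);
    for (std::size_t i = 0; i < values.size(); ++i) values[i] += shifted[i];
  }
  return values;
}
```

With values 0..7 and a logical sub-group size of 4, a shift by 1 gives 1, 2, 3, 3, 5, 6, 7, 7, and the reduction leaves 6 in element 0 and 22 in element 4.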

◆ shift_sub_group_right()

template<typename T >
T syclcompat::experimental::shift_sub_group_right ( unsigned int  member_mask,
sycl::sub_group  g,
T  x,
unsigned int  delta,
int  logical_sub_group_size = 32 
)

Masked version of shift_sub_group_right, which executes a masked sub-group operation.

The member_mask parameter indicates which work-items participate in the call: bit n is set to 1 if and only if the work-item with id n participates. All work-items named in member_mask must execute the call with the same member_mask; otherwise the result is undefined.

Template Parameters
T  Input value type
Parameters
[in] member_mask  Mask of the work-items participating in the call
[in] g  Input sub_group
[in] x  Input value
[in] delta  Shift distance, in work-items
[in] logical_sub_group_size  Input logical sub_group size
Returns
The value of x obtained from the source work-item

Definition at line 403 of file util.hpp.

References sycl::_V1::sub_group::get_local_linear_id(), and __spv::Scope::Subgroup.
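
A right shift is the usual building block for a prefix sum inside a logical sub-group. The host model below uses plain arrays, where element i stands for the value held by the work-item with sub-group local id i. The function names are hypothetical, and the rule that a work-item keeps its own value when fewer than delta work-items precede it in its logical sub-group is an assumption, not something this page states. The model covers only the case where every work-item participates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host model: work-item `id` reads the value of work-item id - delta.
// Assumption: if fewer than delta work-items precede `id` inside its
// logical sub-group, the work-item keeps its own value.
template <typename T>
std::vector<T> model_shift_right(const std::vector<T> &values,
                                 unsigned int delta,
                                 unsigned int logical_size) {
  assert(logical_size > 0 && values.size() % logical_size == 0);
  std::vector<T> out(values.size());
  for (std::size_t id = 0; id < values.size(); ++id) {
    const std::size_t offset = id % logical_size;
    out[id] = (offset >= delta) ? values[id - delta] : values[id];
  }
  return out;
}

// Inclusive prefix sum per logical sub-group, built on the shift. A
// work-item adds the shifted value only when it really came from a
// predecessor, so the kept-own-value case never contributes.
inline std::vector<int> model_inclusive_scan(std::vector<int> values,
                                             unsigned int logical_size) {
  for (unsigned int d = 1; d < logical_size; d *= 2) {
    const std::vector<int> shifted =
        model_shift_right(values, d, logical_size);
    for (std::size_t id = 0; id < values.size(); ++id)
      if (id % logical_size >= d) values[id] += shifted[id];
  }
  return values;
}
```

With eight ones and a logical sub-group size of 4, the scan restarts at each logical sub-group boundary and yields 1, 2, 3, 4, 1, 2, 3, 4.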