XeTLA v0.3.6
Intel® Xe Templates for Linear Algebra - API Definition Document
 
gpu::xetla::kernel::dispatch_policy_int4_dequantize_kslicing< group_swizzle_policy_, num_global_kslicing_, num_local_kslicing_ > Struct Template Reference

4-bit kslicing GEMM implementation.

#include <dispatch_policy.hpp>

Public Types

using group_swizzle_policy = group_swizzle_policy_
 

Static Public Attributes

static constexpr int num_global_kslicing = num_global_kslicing_
 
static constexpr int num_local_kslicing = num_local_kslicing_
 
static constexpr gpu_arch arch_tag = group_swizzle_policy::arch_tag
 

Detailed Description

template<typename group_swizzle_policy_, int num_global_kslicing_ = 1, int num_local_kslicing_ = 1>
struct gpu::xetla::kernel::dispatch_policy_int4_dequantize_kslicing< group_swizzle_policy_, num_global_kslicing_, num_local_kslicing_ >

4-bit kslicing GEMM implementation.

A specialized GEMM implementation that increases hardware occupancy by splitting the GEMM task along the k dimension. It combines inter-group reduction (using global atomic operations) with intra-group reduction (using local memory for data exchange).
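The k-dimension split described above can be illustrated on the CPU. The following is a minimal sketch in plain C++, not XeTLA code: the function names `gemm_k_range` and `gemm_kslicing` are hypothetical, and the serial accumulation loop merely stands in for the atomic or local-memory reduction that a GPU kernel would perform.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: multiplies only the k range [k_begin, k_end).
// A is m x k and B is k x n, both row-major. Returns an m x n partial result.
std::vector<float> gemm_k_range(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t m, std::size_t n, std::size_t k,
                                std::size_t k_begin, std::size_t k_end) {
    assert(a.size() == m * k && b.size() == k * n);
    assert(k_begin <= k_end && k_end <= k);
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t p = k_begin; p < k_end; ++p)
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += a[i * k + p] * b[p * n + j];
    return c;
}

// Hypothetical helper: splits k into num_slices contiguous slices, computes
// one partial product per slice, then sums the partials. On a GPU each slice
// would run concurrently; here the slices run one after another.
std::vector<float> gemm_kslicing(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 std::size_t m, std::size_t n, std::size_t k,
                                 std::size_t num_slices) {
    assert(num_slices > 0);
    std::vector<float> c(m * n, 0.0f);
    const std::size_t slice = (k + num_slices - 1) / num_slices;  // ceil(k / num_slices)
    for (std::size_t s = 0; s < num_slices; ++s) {
        const std::size_t k_begin = std::min(s * slice, k);
        const std::size_t k_end = std::min(k_begin + slice, k);
        const std::vector<float> partial = gemm_k_range(a, b, m, n, k, k_begin, k_end);
        // Reduction step: accumulate this slice's partial result into C.
        for (std::size_t idx = 0; idx < c.size(); ++idx) c[idx] += partial[idx];
    }
    return c;
}
```

Calling `gemm_kslicing(a, b, m, n, k, s)` with any positive `s` computes the same product as `gemm_k_range(a, b, m, n, k, 0, k)`, up to floating-point summation order. The split only changes how the work is partitioned, which is what lets more groups or threads participate when m and n are small.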

Note
Unlike dispatch_policy_kslicing, this policy adds extra handling for 4-bit data.
Template Parameters
group_swizzle_policy_ Is the group swizzle policy; its arch_tag member supplies the HW architecture.
num_global_kslicing_ Is the k dim split ratio between groups.
num_local_kslicing_ Is the k dim split ratio within a group.
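To show how the template parameters map onto the members documented on this page, here is a self-contained sketch. It does not include the XeTLA headers: everything in the `sketch` namespace is a hypothetical stand-in, including the placeholder `gpu_arch` enumerators `ArchA` and `ArchB` and the `mock_swizzle` type. The policy struct is re-declared only to mirror the member list shown here.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Placeholder architecture enum; the enumerator names are assumptions.
enum class gpu_arch { ArchA, ArchB };

// Hypothetical swizzle policy. The dispatch policy below only needs it to
// expose a static arch_tag member.
template <gpu_arch arch_>
struct mock_swizzle {
    static constexpr gpu_arch arch_tag = arch_;
};

// Mirrors the members documented on this page (not the real definition).
template <typename group_swizzle_policy_,
          int num_global_kslicing_ = 1,
          int num_local_kslicing_ = 1>
struct dispatch_policy_int4_dequantize_kslicing {
    using group_swizzle_policy = group_swizzle_policy_;
    static constexpr int num_global_kslicing = num_global_kslicing_;
    static constexpr int num_local_kslicing = num_local_kslicing_;
    static constexpr gpu_arch arch_tag = group_swizzle_policy::arch_tag;
};

// Both split ratios default to 1, i.e. no k slicing.
using policy_default =
    dispatch_policy_int4_dequantize_kslicing<mock_swizzle<gpu_arch::ArchA>>;

// Split k across 2 groups, and across 4 slices within each group.
using policy_split =
    dispatch_policy_int4_dequantize_kslicing<mock_swizzle<gpu_arch::ArchB>, 2, 4>;

static_assert(policy_default::num_global_kslicing == 1, "default global ratio");
static_assert(policy_default::num_local_kslicing == 1, "default local ratio");
static_assert(policy_split::arch_tag == gpu_arch::ArchB,
              "arch_tag is forwarded from the swizzle policy");
static_assert(std::is_same<policy_split::group_swizzle_policy,
                           mock_swizzle<gpu_arch::ArchB>>::value,
              "the swizzle policy is re-exported as a member type");

}  // namespace sketch
```

Note that the architecture is not a separate template argument in this sketch: it is read from the swizzle policy, matching the `arch_tag = group_swizzle_policy::arch_tag` definition listed under Static Public Attributes.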

Member Typedef Documentation

◆ group_swizzle_policy

template<typename group_swizzle_policy_ , int num_global_kslicing_ = 1, int num_local_kslicing_ = 1>
using gpu::xetla::kernel::dispatch_policy_int4_dequantize_kslicing< group_swizzle_policy_, num_global_kslicing_, num_local_kslicing_ >::group_swizzle_policy = group_swizzle_policy_

Member Data Documentation

◆ arch_tag

template<typename group_swizzle_policy_ , int num_global_kslicing_ = 1, int num_local_kslicing_ = 1>
constexpr gpu_arch gpu::xetla::kernel::dispatch_policy_int4_dequantize_kslicing< group_swizzle_policy_, num_global_kslicing_, num_local_kslicing_ >::arch_tag = group_swizzle_policy::arch_tag
static constexpr

◆ num_global_kslicing

template<typename group_swizzle_policy_ , int num_global_kslicing_ = 1, int num_local_kslicing_ = 1>
constexpr int gpu::xetla::kernel::dispatch_policy_int4_dequantize_kslicing< group_swizzle_policy_, num_global_kslicing_, num_local_kslicing_ >::num_global_kslicing = num_global_kslicing_
static constexpr

◆ num_local_kslicing

template<typename group_swizzle_policy_ , int num_global_kslicing_ = 1, int num_local_kslicing_ = 1>
constexpr int gpu::xetla::kernel::dispatch_policy_int4_dequantize_kslicing< group_swizzle_policy_, num_global_kslicing_, num_local_kslicing_ >::num_local_kslicing = num_local_kslicing_
static constexpr