This patch provides new interfaces for matrix multiply:
- A new class called joint_matrix is introduced, and the user needs to specify the type of the elements, sizes, and the memory layout.
- joint_matrix_load loads data from main memory into AMX tiles or the kernel's local memory.
- joint_matrix_store stores data from AMX tiles or the kernel's local memory back to main memory.
- joint_matrix_mad performs matrix multiply-and-add. It multiplies the matrices A and B, accumulates the product with C, and returns the result.
The following operation can be realized with the interfaces: C = A*B+C
- All cases where A(int8, any-size, row_major), B(int8, any-size, packed_b), C(int32, any-size, row_major)
- All cases where A(bf16, any-size, row_major), B(bf16, any-size, packed_b), C(float, any-size, row_major)
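For reference, the arithmetic behind C = A*B+C in the int8 case can be sketched in plain scalar C++. This is only an illustration of the expected result, not the joint_matrix API: `reference_mad` is a hypothetical helper, all three matrices are assumed to be stored row-major, and the packed_b layout of B is deliberately not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference (not part of this patch): computes
// C = A*B + C for int8 inputs with an int32 accumulator.
// Assumption: A is MxK, B is KxN, C is MxN, all row-major; the
// packed_b layout used by the real interface is not modeled here.
std::vector<std::int32_t> reference_mad(const std::vector<std::int8_t>& a,
                                        const std::vector<std::int8_t>& b,
                                        std::vector<std::int32_t> c,
                                        std::size_t M, std::size_t N,
                                        std::size_t K) {
  assert(a.size() == M * K);
  assert(b.size() == K * N);
  assert(c.size() == M * N);
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      std::int32_t acc = c[i * N + j];
      for (std::size_t k = 0; k < K; ++k) {
        // Widen to int32 before multiplying so the product cannot overflow int8.
        acc += static_cast<std::int32_t>(a[i * K + k]) *
               static_cast<std::int32_t>(b[k * N + j]);
      }
      c[i * N + j] = acc;
    }
  }
  return c;  // accumulated result, returned by value
}
```

A helper like this could serve as a host-side check when validating the output of a kernel that uses joint_matrix_mad, provided B is converted from packed_b to row-major first.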