tilelang.intrinsics.utils
=========================

.. py:module:: tilelang.intrinsics.utils


Functions
---------

.. autoapisummary::

   tilelang.intrinsics.utils.get_ldmatrix_offset
   tilelang.intrinsics.utils.shared_16x16_to_mma_32x8_layout
   tilelang.intrinsics.utils.shared_16x32_to_mma_32x16_layout
   tilelang.intrinsics.utils.shared_32x16_to_mma_32x16_layout
   tilelang.intrinsics.utils.mma_store_index_map
   tilelang.intrinsics.utils.mma_store_index_map_fp64
   tilelang.intrinsics.utils.mfma_store_index_map
   tilelang.intrinsics.utils.get_mma_micro_size


Module Contents
---------------

.. py:function:: get_ldmatrix_offset(matrix, row_idx, col_idx, stride, dtype = 'float16', transposed = False)

.. py:function:: shared_16x16_to_mma_32x8_layout(i, j)

.. py:function:: shared_16x32_to_mma_32x16_layout(i, j)

.. py:function:: shared_32x16_to_mma_32x16_layout(i, j)

.. py:function:: mma_store_index_map(thread_id, local_id)

.. py:function:: mma_store_index_map_fp64(thread_id, local_id)

.. py:function:: mfma_store_index_map(thread_id, local_id)

.. py:function:: get_mma_micro_size(dtype)

   Return the MMA (Tensor Core) micro-tile dimensions for a given data type.

   This function returns the micro tile sizes (x, y, k) used by MMA/Tensor Core operations.

   - x: tile width in the output/result dimension
   - y: tile height in the output/result dimension
   - k: tile depth in the reduction/K dimension

   Accepted dtype strings include "float16", "int8" and some FP8 identifiers ("float8_e4m3", "float8_e5m2"). For FP8 and int8 types the reduction depth (`k`) is 32; for float16 it is 16.

   :returns: (micro_size_x, micro_size_y, micro_size_k)
   :rtype: tuple[int, int, int]
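
The dtype-to-depth rule described for ``get_mma_micro_size`` can be illustrated with a minimal, self-contained sketch. This is not the library source: ``micro_size_sketch`` and ``k_steps`` are hypothetical names introduced here, and the value 16 for the x and y sizes is an assumption, since the description above only states the ``k`` values.

.. code-block:: python

   from typing import Tuple

   # Dtypes documented above as using a reduction depth of 32.
   _K32_DTYPES = {"int8", "float8_e4m3", "float8_e5m2"}


   def micro_size_sketch(dtype: str) -> Tuple[int, int, int]:
       """Hypothetical stand-in mirroring the documented behaviour."""
       # Assumption: x and y are fixed at 16; the docs do not state them.
       micro_size_x = micro_size_y = 16
       # Documented: k is 32 for int8/FP8 and 16 for float16.
       micro_size_k = 32 if dtype in _K32_DTYPES else 16
       return micro_size_x, micro_size_y, micro_size_k


   def k_steps(depth: int, dtype: str) -> int:
       """Number of micro-tile steps needed to cover a reduction depth."""
       _, _, micro_k = micro_size_sketch(dtype)
       return (depth + micro_k - 1) // micro_k  # ceiling division


   if __name__ == "__main__":
       for name in ("float16", "int8", "float8_e4m3"):
           print(name, micro_size_sketch(name), k_steps(128, name))

Under these assumptions, a reduction depth of 128 takes twice as many micro-tile steps for ``float16`` as for ``int8`` or the FP8 types, because each step covers half as many elements along K.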