tilelang.language.kernel

Kernel launching language interface in TileLang.

Classes

FrameStack

A simple stack-like wrapper around a deque that provides

KernelLaunchFrame

KernelLaunchFrame is a custom TIRFrame that manages block/thread indices

Functions

Kernel(*blocks[, threads, cluster_dims, is_cpu, prelude])

Tools to quickly construct a GPU kernel launch frame.

get_thread_binding([dim])

Returns the thread binding for the given dimension.

get_thread_bindings()

Returns all three thread bindings.

get_block_binding([dim])

Returns the block binding for the given dimension.

get_block_bindings()

Returns all three block bindings.

get_thread_extent([dim])

Returns the thread extent for the given dimension.

get_thread_extents()

Returns all three thread extents.

get_block_extent([dim])

Returns the block extent for the given dimension.

get_block_extents()

Returns all three block extents.

Module Contents

class tilelang.language.kernel.FrameStack

A simple stack-like wrapper around a deque that provides push, pop, and top methods for convenience.

push(item)

Pushes an item onto the top of the stack.

pop()

Pops and returns the top of the stack, or returns None if the stack is empty.

top()

Returns the item on the top of the stack without removing it, or None if the stack is empty.

size()

Returns the number of items in the stack.

__len__()

Returns the number of items in the stack.

__bool__()

Allows truthy checks on the stack object itself, e.g., 'if stack: ...'

class tilelang.language.kernel.KernelLaunchFrame

Bases: tvm.script.ir_builder.tir.frame.TIRFrame

KernelLaunchFrame is a custom TIRFrame that manages block/thread indices and handles the entry and exit of the kernel launch scope.

__enter__()

Enters the KernelLaunchFrame scope and pushes this frame onto the stack. Returns one Var if we detect exactly 5 frames (meaning there is a single block dimension), or a list of Vars otherwise.

返回类型:

tvm.tir.Var | list[tvm.tir.Var]

__exit__(ptype, value, trace)

Exits the KernelLaunchFrame scope and pops this frame from the stack, but only if it's indeed the topmost frame.

classmethod Current()

Returns the topmost (current) KernelLaunchFrame from the stack if it exists, or None if the stack is empty.

返回类型:

KernelLaunchFrame | None

get_block_extent(dim)

Returns the block extent for the given dimension. dim=0 corresponds to blockIdx.x, dim=1 to blockIdx.y, and dim=2 to blockIdx.z.

参数:

dim (int)

返回类型:

int

get_block_extents()

Returns the block extents for all three dimensions.

返回类型:

list[int]

get_thread_extent(dim)

Returns the thread extent for the given dimension. dim=0 corresponds to threadIdx.x, dim=1 to threadIdx.y, and dim=2 to threadIdx.z.

参数:

dim (int)

返回类型:

int

get_thread_extents()

Returns the thread extents for all three dimensions.

返回类型:

list[int]

get_thread_binding(dim=0)

Returns the thread binding for the given dimension. dim=0 corresponds to threadIdx.x, dim=1 to threadIdx.y, and dim=2 to threadIdx.z.

参数:

dim (int)

返回类型:

tvm.tir.Var

get_thread_bindings()

Returns the thread binding for the given dimension. dim=0 corresponds to threadIdx.x, dim=1 to threadIdx.y, and dim=2 to threadIdx.z.

返回类型:

list[tvm.tir.Var]

get_num_threads()

Returns the thread indices from the topmost frame.

返回类型:

int

get_block_binding(dim=0)

Returns the block binding for the given dimension. dim=0 corresponds to blockIdx.x, dim=1 to blockIdx.y, and dim=2 to blockIdx.z.

参数:

dim (int)

返回类型:

tvm.tir.Var

get_block_bindings()

Returns all three block bindings.

返回类型:

list[tvm.tir.Var]

property blocks: list[tvm.tir.Var]

Returns the block indices from the topmost frame.

返回类型:

list[tvm.tir.Var]

property threads: list[tvm.tir.Var]

Returns the thread indices from the topmost frame.

返回类型:

list[tvm.tir.Var]

property num_threads: int

Returns the total number of threads.

返回类型:

int

tilelang.language.kernel.Kernel(*blocks, threads=None, cluster_dims=None, is_cpu=False, prelude=None)

Tools to quickly construct a GPU kernel launch frame.

参数:
  • blocks (int) -- A list of extent, can be 1-3 dimension, representing gridDim.(x|y|z)

  • threads (int) -- A integer representing blockDim.x Or a list of integers representing blockDim.(x|y|z) if the value is -1, we skip the threadIdx.x binding.

  • cluster_dims (int | tuple[int, int, int] | list[int] | None) -- The cluster dimensions for SM90+ cluster launch. For example, use 2 or (2, 1, 1) to create 2-CTA clusters. When specified, the kernel will be launched using cudaLaunchKernelEx with cudaLaunchAttributeClusterDimension.

  • is_cpu (bool) -- Whether the kernel is running on CPU. Thus we will not bind threadIdx.x, threadIdx.y, threadIdx.z. and blockIdx.x, blockIdx.y, blockIdx.z.

  • prelude (str) -- The import c code of the kernel, will be injected before the generated kernel code.

返回:

res -- The result LaunchThreadFrame.

返回类型:

Tuple[frame.LaunchThreadFrame]

示例

Create a 1-D CUDA kernel launch and unpack the single block index:

with T.Kernel(T.ceildiv(N, 128), threads=128) as bx:
    # bx is the blockIdx.x binding (also iterable as (bx,))
    ...

Launch a 2-D grid while requesting two thread dimensions:

with T.Kernel(grid_x, grid_y, threads=(64, 2)) as (bx, by):
    tx, ty = T.get_thread_bindings()
    ...

Emit a CPU kernel where thread bindings are skipped:

with T.Kernel(loop_extent, is_cpu=True) as (i,):
    ...
tilelang.language.kernel.get_thread_binding(dim=0)

Returns the thread binding for the given dimension.

参数:

dim (int)

返回类型:

tvm.tir.Var

tilelang.language.kernel.get_thread_bindings()

Returns all three thread bindings.

返回类型:

list[tvm.tir.Var]

tilelang.language.kernel.get_block_binding(dim=0)

Returns the block binding for the given dimension.

参数:

dim (int)

返回类型:

tvm.tir.Var

tilelang.language.kernel.get_block_bindings()

Returns all three block bindings.

返回类型:

list[tvm.tir.Var]

tilelang.language.kernel.get_thread_extent(dim=0)

Returns the thread extent for the given dimension.

参数:

dim (int)

返回类型:

int

tilelang.language.kernel.get_thread_extents()

Returns all three thread extents.

返回类型:

list[int]

tilelang.language.kernel.get_block_extent(dim=0)

Returns the block extent for the given dimension.

参数:

dim (int)

返回类型:

int

tilelang.language.kernel.get_block_extents()

Returns all three block extents.

返回类型:

list[int]