tilelang.language.kernel

Kernel launching language interface in TileLang.

類別

FrameStack

A simple stack-like wrapper around a deque that provides

KernelLaunchFrame

KernelLaunchFrame is a custom TIRFrame that manages block/thread indices

函式

Kernel(*blocks[, threads, cluster_dims, is_cpu, prelude])

Tools to quickly construct a GPU kernel launch frame.

get_thread_binding([dim])

Returns the thread binding for the given dimension.

get_thread_bindings()

Returns all three thread bindings.

get_block_binding([dim])

Returns the block binding for the given dimension.

get_block_bindings()

Returns all three block bindings.

get_thread_extent([dim])

Returns the thread extent for the given dimension.

get_thread_extents()

Returns all three thread extents.

get_block_extent([dim])

Returns the block extent for the given dimension.

get_block_extents()

Returns all three block extents.

Module Contents

class tilelang.language.kernel.FrameStack

A simple stack-like wrapper around a deque that provides push, pop, and top methods for convenience.

push(item)

Pushes an item onto the top of the stack.

pop()

Pops and returns the top of the stack, or returns None if the stack is empty.

top()

Returns the item on the top of the stack without removing it, or None if the stack is empty.

size()

Returns the number of items in the stack.

__len__()

Returns the number of items in the stack.

__bool__()

Allows truthy checks on the stack object itself, e.g., 'if stack: ...'

class tilelang.language.kernel.KernelLaunchFrame

Bases: tvm.script.ir_builder.tir.frame.TIRFrame

KernelLaunchFrame is a custom TIRFrame that manages block/thread indices and handles the entry and exit of the kernel launch scope.

__enter__()

Enters the KernelLaunchFrame scope and pushes this frame onto the stack. Returns one Var if we detect exactly 5 frames (meaning there is a single block dimension), or a list of Vars otherwise.

回傳型別:

tvm.tir.Var | list[tvm.tir.Var]

__exit__(ptype, value, trace)

Exits the KernelLaunchFrame scope and pops this frame from the stack, but only if it's indeed the topmost frame.

classmethod Current()

Returns the topmost (current) KernelLaunchFrame from the stack if it exists, or None if the stack is empty.

回傳型別:

KernelLaunchFrame | None

get_block_extent(dim)

Returns the block extent for the given dimension. dim=0 corresponds to blockIdx.x, dim=1 to blockIdx.y, and dim=2 to blockIdx.z.

參數:

dim (int)

回傳型別:

int

get_block_extents()

Returns the block extents for all three dimensions.

回傳型別:

list[int]

get_thread_extent(dim)

Returns the thread extent for the given dimension. dim=0 corresponds to threadIdx.x, dim=1 to threadIdx.y, and dim=2 to threadIdx.z.

參數:

dim (int)

回傳型別:

int

get_thread_extents()

Returns the thread extents for all three dimensions.

回傳型別:

list[int]

get_thread_binding(dim=0)

Returns the thread binding for the given dimension. dim=0 corresponds to threadIdx.x, dim=1 to threadIdx.y, and dim=2 to threadIdx.z.

參數:

dim (int)

回傳型別:

tvm.tir.Var

get_thread_bindings()

Returns the thread binding for the given dimension. dim=0 corresponds to threadIdx.x, dim=1 to threadIdx.y, and dim=2 to threadIdx.z.

回傳型別:

list[tvm.tir.Var]

get_num_threads()

Returns the thread indices from the topmost frame.

回傳型別:

int

get_block_binding(dim=0)

Returns the block binding for the given dimension. dim=0 corresponds to blockIdx.x, dim=1 to blockIdx.y, and dim=2 to blockIdx.z.

參數:

dim (int)

回傳型別:

tvm.tir.Var

get_block_bindings()

Returns all three block bindings.

回傳型別:

list[tvm.tir.Var]

property blocks: list[tvm.tir.Var]

Returns the block indices from the topmost frame.

回傳型別:

list[tvm.tir.Var]

property threads: list[tvm.tir.Var]

Returns the thread indices from the topmost frame.

回傳型別:

list[tvm.tir.Var]

property num_threads: int

Returns the total number of threads.

回傳型別:

int

tilelang.language.kernel.Kernel(*blocks, threads=None, cluster_dims=None, is_cpu=False, prelude=None)

Tools to quickly construct a GPU kernel launch frame.

參數:
  • blocks (int) -- A list of extent, can be 1-3 dimension, representing gridDim.(x|y|z)

  • threads (int) -- A integer representing blockDim.x Or a list of integers representing blockDim.(x|y|z) if the value is -1, we skip the threadIdx.x binding.

  • cluster_dims (int | tuple[int, int, int] | list[int] | None) -- The cluster dimensions for SM90+ cluster launch. For example, use 2 or (2, 1, 1) to create 2-CTA clusters. When specified, the kernel will be launched using cudaLaunchKernelEx with cudaLaunchAttributeClusterDimension.

  • is_cpu (bool) -- Whether the kernel is running on CPU. Thus we will not bind threadIdx.x, threadIdx.y, threadIdx.z. and blockIdx.x, blockIdx.y, blockIdx.z.

  • prelude (str) -- The import c code of the kernel, will be injected before the generated kernel code.

回傳:

res -- The result LaunchThreadFrame.

回傳型別:

Tuple[frame.LaunchThreadFrame]

範例

Create a 1-D CUDA kernel launch and unpack the single block index:

with T.Kernel(T.ceildiv(N, 128), threads=128) as bx:
    # bx is the blockIdx.x binding (also iterable as (bx,))
    ...

Launch a 2-D grid while requesting two thread dimensions:

with T.Kernel(grid_x, grid_y, threads=(64, 2)) as (bx, by):
    tx, ty = T.get_thread_bindings()
    ...

Emit a CPU kernel where thread bindings are skipped:

with T.Kernel(loop_extent, is_cpu=True) as (i,):
    ...
tilelang.language.kernel.get_thread_binding(dim=0)

Returns the thread binding for the given dimension.

參數:

dim (int)

回傳型別:

tvm.tir.Var

tilelang.language.kernel.get_thread_bindings()

Returns all three thread bindings.

回傳型別:

list[tvm.tir.Var]

tilelang.language.kernel.get_block_binding(dim=0)

Returns the block binding for the given dimension.

參數:

dim (int)

回傳型別:

tvm.tir.Var

tilelang.language.kernel.get_block_bindings()

Returns all three block bindings.

回傳型別:

list[tvm.tir.Var]

tilelang.language.kernel.get_thread_extent(dim=0)

Returns the thread extent for the given dimension.

參數:

dim (int)

回傳型別:

int

tilelang.language.kernel.get_thread_extents()

Returns all three thread extents.

回傳型別:

list[int]

tilelang.language.kernel.get_block_extent(dim=0)

Returns the block extent for the given dimension.

參數:

dim (int)

回傳型別:

int

tilelang.language.kernel.get_block_extents()

Returns all three block extents.

回傳型別:

list[int]