tilelang.carver.arch.driver.cuda_driver

Classes

cudaDeviceAttrNames

Functions

get_cuda_device_properties([device_id])

get_device_name([device_id])

get_shared_memory_per_block([device_id, format])

get_device_attribute(attr[, device_id])

get_max_dynamic_shared_size_bytes([device_id, format])

Get the maximum dynamic shared memory size in bytes, kilobytes, or megabytes.

get_persisting_l2_cache_max_size([device_id])

get_num_sms([device_id])

Get the number of streaming multiprocessors (SMs) on the CUDA device.

get_registers_per_block([device_id])

Get the maximum number of 32-bit registers available per block.

Module Contents

class tilelang.carver.arch.driver.cuda_driver.cudaDeviceAttrNames

Integer IDs from the CUDA runtime's cudaDeviceAttr enum. For the full list, refer to https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g49e2f8c2c0bd6fe264f2fc970912e5cd

cudaDevAttrMaxThreadsPerBlock: int = 1
cudaDevAttrMaxRegistersPerBlock: int = 12
cudaDevAttrMaxSharedMemoryPerMultiprocessor: int = 81
cudaDevAttrMaxPersistingL2CacheSize: int = 108

tilelang.carver.arch.driver.cuda_driver.get_cuda_device_properties(device_id=0)
Parameters:

device_id (int)

Return type:

torch.cuda._CudaDeviceProperties | None

tilelang.carver.arch.driver.cuda_driver.get_device_name(device_id=0)
Parameters:

device_id (int)

Return type:

str | None

tilelang.carver.arch.driver.cuda_driver.get_shared_memory_per_block(device_id=0, format='bytes')
Parameters:
  • device_id (int)

  • format (str)

Return type:

int | None

tilelang.carver.arch.driver.cuda_driver.get_device_attribute(attr, device_id=0)
Parameters:
  • attr (int)

  • device_id (int)

Return type:

int
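
The signature above takes an attribute ID and a device ID and returns an integer. A minimal sketch of how such a lookup can be made against the CUDA runtime with ctypes follows. It is not necessarily how this module implements it: the helper name query_device_attribute, the library lookup via ctypes.util.find_library, and the choice to return None on failure are all assumptions made for illustration (the documented function returns int).

```python
import ctypes
import ctypes.util
from typing import Optional

# Attribute IDs copied from the cudaDeviceAttrNames class on this page.
CUDA_DEV_ATTR_MAX_THREADS_PER_BLOCK = 1
CUDA_DEV_ATTR_MAX_REGISTERS_PER_BLOCK = 12


def query_device_attribute(attr: int, device_id: int = 0) -> Optional[int]:
    """Hypothetical helper: return the attribute value, or None on failure."""
    # Assumption: let ctypes locate the CUDA runtime library on this system.
    lib_name = ctypes.util.find_library("cudart")
    if lib_name is None:
        return None
    try:
        cudart = ctypes.CDLL(lib_name)
    except OSError:
        return None

    value = ctypes.c_int(0)
    # C signature:
    # cudaError_t cudaDeviceGetAttribute(int* value, cudaDeviceAttr attr, int device)
    status = cudart.cudaDeviceGetAttribute(
        ctypes.byref(value), ctypes.c_int(attr), ctypes.c_int(device_id)
    )
    if status != 0:  # 0 is cudaSuccess
        return None
    return value.value


if __name__ == "__main__":
    # Prints None on machines without a usable CUDA runtime or device.
    print(query_device_attribute(CUDA_DEV_ATTR_MAX_THREADS_PER_BLOCK))
```

Returning None keeps the sketch usable on machines without a GPU; code that needs a hard failure could raise instead.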

tilelang.carver.arch.driver.cuda_driver.get_max_dynamic_shared_size_bytes(device_id=0, format='bytes')

Get the maximum dynamic shared memory size in bytes, kilobytes, or megabytes.

Parameters:
  • device_id (int)

  • format (str)

Return type:

int | None
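
The format parameter selects the unit of the returned size. The sketch below illustrates one plausible conversion. Only the default 'bytes' appears in the signature; the strings 'kb' and 'mb', the 1024-based divisors, and the use of integer division (suggested by the int return type) are assumptions, and convert_size is a hypothetical helper, not part of this module.

```python
def convert_size(num_bytes: int, format: str = "bytes") -> int:
    """Hypothetical helper: express a byte count in the requested unit."""
    # Assumed unit names and binary (1024-based) divisors.
    divisors = {"bytes": 1, "kb": 1024, "mb": 1024 * 1024}
    if format not in divisors:
        raise ValueError(f"format must be one of {sorted(divisors)}, got {format!r}")
    # Integer division keeps the result an int and truncates partial units.
    return num_bytes // divisors[format]


if __name__ == "__main__":
    print(convert_size(101376, "kb"))  # 101376 bytes is exactly 99 KiB
```

Because the division truncates, sizes smaller than one unit come back as 0; requesting bytes avoids that loss of precision.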

tilelang.carver.arch.driver.cuda_driver.get_persisting_l2_cache_max_size(device_id=0)
Parameters:

device_id (int)

Return type:

int

tilelang.carver.arch.driver.cuda_driver.get_num_sms(device_id=0)

Get the number of streaming multiprocessors (SMs) on the CUDA device.

Parameters:

device_id (int, optional) – The CUDA device ID. Defaults to 0.

Returns:

The number of SMs on the device.

Return type:

int

Raises:

RuntimeError – If unable to get the device properties.
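
Since get_num_sms is documented to raise RuntimeError when the device properties cannot be read, callers may want a fallback. The sketch below shows one way to do that. The wrapper num_sms_or_default is hypothetical, and it accepts any callable with the same signature so that it can be exercised without a GPU.

```python
from typing import Callable, Optional


def num_sms_or_default(
    query: Callable[[int], int],
    device_id: int = 0,
    default: Optional[int] = None,
) -> Optional[int]:
    """Hypothetical wrapper: call `query(device_id)`, falling back on RuntimeError."""
    try:
        return query(device_id)
    except RuntimeError:
        # The documented failure mode: device properties were unavailable.
        return default


if __name__ == "__main__":
    def fake_query(device_id: int) -> int:
        # Stand-in for get_num_sms on a machine without CUDA.
        raise RuntimeError("no CUDA device")

    print(num_sms_or_default(fake_query, default=1))
```

In real use, the callable passed as query would be get_num_sms imported from tilelang.carver.arch.driver.cuda_driver.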

tilelang.carver.arch.driver.cuda_driver.get_registers_per_block(device_id=0)

Get the maximum number of 32-bit registers available per block.

Parameters:

device_id (int)

Return type:

int