tilelang.carver.arch.driver.cuda_driver

Classes

cudaDeviceAttrNames

Functions

get_cuda_device_properties([device_id])

get_device_name([device_id])

get_shared_memory_per_block([device_id, format])

get_device_attribute(attr[, device_id])

get_max_dynamic_shared_size_bytes([device_id, format])

Get the maximum dynamic shared memory size in bytes, kilobytes, or megabytes.

get_persisting_l2_cache_max_size([device_id])

get_num_sms([device_id])

Get the number of streaming multiprocessors (SMs) on the CUDA device.

get_registers_per_block([device_id])

Get the maximum number of 32-bit registers available per block.

Module Contents

class tilelang.carver.arch.driver.cuda_driver.cudaDeviceAttrNames

CUDA device attribute IDs. Refer to the CUDA Runtime API type reference: https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g49e2f8c2c0bd6fe264f2fc970912e5cd

cudaDevAttrMaxThreadsPerBlock: int = 1
cudaDevAttrMaxRegistersPerBlock: int = 12
cudaDevAttrMaxSharedMemoryPerMultiprocessor: int = 81
cudaDevAttrMaxPersistingL2CacheSize: int = 108
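
As a hedged illustration, the four constants above can be mirrored in a standard-library `IntEnum` so that a numeric ID can be mapped back to its name. The class `DeviceAttr` and the helper `attr_name` are hypothetical and not part of tilelang; only the names and values are taken from the listing above.

```python
from enum import IntEnum


class DeviceAttr(IntEnum):
    """Hypothetical mirror of the attribute IDs listed above (not tilelang API)."""

    cudaDevAttrMaxThreadsPerBlock = 1
    cudaDevAttrMaxRegistersPerBlock = 12
    cudaDevAttrMaxSharedMemoryPerMultiprocessor = 81
    cudaDevAttrMaxPersistingL2CacheSize = 108


def attr_name(attr_id: int) -> str:
    """Return the attribute name for a numeric ID, or a placeholder if unknown."""
    try:
        return DeviceAttr(attr_id).name
    except ValueError:
        return f"<unknown attribute {attr_id}>"
```

A reverse lookup like this can make log messages easier to read when an attribute query fails.
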
tilelang.carver.arch.driver.cuda_driver.get_cuda_device_properties(device_id=0)
Parameters:

device_id (int)

Return type:

torch.cuda._CudaDeviceProperties | None

tilelang.carver.arch.driver.cuda_driver.get_device_name(device_id=0)
Parameters:

device_id (int)

Return type:

str | None

tilelang.carver.arch.driver.cuda_driver.get_shared_memory_per_block(device_id=0, format='bytes')
Parameters:
  • device_id (int)

  • format (str)

Return type:

int | None

tilelang.carver.arch.driver.cuda_driver.get_device_attribute(attr, device_id=0)
Parameters:
  • attr (int)

  • device_id (int)

Return type:

int
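
The sketch below shows one way an integer attribute ID could be resolved to a value: by calling the CUDA Runtime function `cudaDeviceGetAttribute` through `ctypes`. It is an assumption about the general approach, not a description of how tilelang implements `get_device_attribute`. The helper name `query_device_attribute` is hypothetical, and it returns `None` instead of raising when the runtime library is missing or the call fails.

```python
import ctypes
import ctypes.util
from typing import Optional


def query_device_attribute(attr: int, device_id: int = 0) -> Optional[int]:
    """Hypothetical helper: query one attribute via cudaDeviceGetAttribute.

    Returns None when the CUDA runtime cannot be loaded or the call fails.
    """
    lib_path = ctypes.util.find_library("cudart")
    if lib_path is None:
        return None
    try:
        cudart = ctypes.CDLL(lib_path)
        func = cudart.cudaDeviceGetAttribute
    except (OSError, AttributeError):
        return None

    # C signature:
    # cudaError_t cudaDeviceGetAttribute(int* value, cudaDeviceAttr attr, int device)
    func.argtypes = [ctypes.POINTER(ctypes.c_int), ctypes.c_int, ctypes.c_int]
    func.restype = ctypes.c_int

    value = ctypes.c_int(0)
    status = func(ctypes.byref(value), attr, device_id)
    # A status of 0 corresponds to cudaSuccess.
    return value.value if status == 0 else None
```

For example, `query_device_attribute(12)` would ask for `cudaDevAttrMaxRegistersPerBlock` on device 0.
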

tilelang.carver.arch.driver.cuda_driver.get_max_dynamic_shared_size_bytes(device_id=0, format='bytes')

Get the maximum dynamic shared memory size in bytes, kilobytes, or megabytes.

Parameters:
  • device_id (int)

  • format (str)

Return type:

int | None
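
The `format` argument selects the unit of the returned size. A minimal sketch of such a conversion follows. The helper `convert_size`, the unit strings `"kb"` and `"mb"`, and the use of floor division are assumptions made for illustration; the listing above only states that bytes, kilobytes, and megabytes are supported and that the default is `'bytes'`.

```python
def convert_size(num_bytes: int, fmt: str = "bytes") -> int:
    """Hypothetical helper: express a byte count in the requested unit.

    Uses floor division, so sizes below one unit become 0.
    """
    divisors = {"bytes": 1, "kb": 1024, "mb": 1024 * 1024}
    if fmt not in divisors:
        raise ValueError(
            f"Invalid format {fmt!r}; expected one of {sorted(divisors)}"
        )
    return num_bytes // divisors[fmt]
```

Because the result is an integer, a sub-megabyte shared memory size reported in megabytes would round down to 0 under this sketch, so bytes or kilobytes are the safer units for such values.
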

tilelang.carver.arch.driver.cuda_driver.get_persisting_l2_cache_max_size(device_id=0)
Parameters:

device_id (int)

Return type:

int

tilelang.carver.arch.driver.cuda_driver.get_num_sms(device_id=0)

Get the number of streaming multiprocessors (SMs) on the CUDA device.

Parameters:

device_id (int, optional) -- The CUDA device ID. Defaults to 0.

Returns:

The number of SMs on the device.

Return type:

int

Raises:

RuntimeError -- If unable to get the device properties.
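
For comparison, the same quantity can be read directly from PyTorch's device properties. The wrapper below is a hypothetical sketch, not tilelang code: unlike `get_num_sms`, which is documented to raise `RuntimeError`, it returns `None` when PyTorch is not installed or the requested CUDA device is not available.

```python
from typing import Optional


def num_sms_or_none(device_id: int = 0) -> Optional[int]:
    """Hypothetical wrapper: SM count via PyTorch, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    # Guard against machines without CUDA or with fewer devices than requested.
    if not torch.cuda.is_available() or device_id >= torch.cuda.device_count():
        return None
    return torch.cuda.get_device_properties(device_id).multi_processor_count
```

Returning `None` suits optional tuning logic that should fall back to a default; code that cannot proceed without the SM count may prefer the documented raising behavior.
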

tilelang.carver.arch.driver.cuda_driver.get_registers_per_block(device_id=0)

Get the maximum number of 32-bit registers available per block.

Parameters:

device_id (int)

Return type:

int