tilelang.autotuner.param
The auto-tune parameters.
Attributes
Classes
CompileArgs | Compile arguments for the auto-tuner. A detailed description can be found in tilelang.jit.compile.
ProfileArgs | Profile arguments for the auto-tuner.
AutotuneResult | Results from the auto-tuning process.
Module Contents
- tilelang.autotuner.param.BEST_CONFIG_PATH = 'best_config.json'
- tilelang.autotuner.param.FUNCTION_PATH = 'function.pkl'
- tilelang.autotuner.param.LATENCY_PATH = 'latency.json'
- tilelang.autotuner.param.DEVICE_KERNEL_PATH = 'device_kernel.cu'
- tilelang.autotuner.param.HOST_KERNEL_PATH = 'host_kernel.cu'
- tilelang.autotuner.param.EXECUTABLE_PATH = 'executable.so'
- tilelang.autotuner.param.KERNEL_LIB_PATH = 'kernel_lib.so'
- tilelang.autotuner.param.KERNEL_CUBIN_PATH = 'kernel.cubin'
- tilelang.autotuner.param.KERNEL_PY_PATH = 'kernel.py'
- tilelang.autotuner.param.PARAMS_PATH = 'params.pkl'
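The constants above are bare file names, so they presumably get joined onto a per-result directory when artifacts are saved or loaded. The following is a minimal sketch of that idea. The helper `artifact_paths` and the directory name `autotune_cache/example_key` are hypothetical and not part of tilelang; only the file names are taken from the listing above.

```python
from pathlib import Path

# File names copied from the module constants documented above.
BEST_CONFIG_PATH = "best_config.json"
LATENCY_PATH = "latency.json"
DEVICE_KERNEL_PATH = "device_kernel.cu"
HOST_KERNEL_PATH = "host_kernel.cu"
KERNEL_LIB_PATH = "kernel_lib.so"
PARAMS_PATH = "params.pkl"


def artifact_paths(cache_dir: Path) -> dict:
    """Map each artifact file name to its full path under cache_dir.

    Hypothetical helper for illustration; tilelang's real layout may differ.
    """
    names = [
        BEST_CONFIG_PATH,
        LATENCY_PATH,
        DEVICE_KERNEL_PATH,
        HOST_KERNEL_PATH,
        KERNEL_LIB_PATH,
        PARAMS_PATH,
    ]
    return {name: cache_dir / name for name in names}


# Assumed directory layout: one sub-directory per tuning result.
paths = artifact_paths(Path("autotune_cache") / "example_key")
```

Keeping the file names as module-level constants means that code which writes a result and code which reads it back refer to the same names.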
- class tilelang.autotuner.param.CompileArgs
Compile arguments for the auto-tuner. A detailed description can be found in tilelang.jit.compile.
- out_idx
List of output tensor indices.
- execution_backend
Execution backend to use for kernel execution (default: "auto").
- target
Compilation target, either as a string or a TVM Target object (default: "auto").
- target_host
Target host for cross-compilation (default: None).
- verbose
Whether to enable verbose output (default: False).
- pass_configs
Additional keyword arguments to pass to the compiler PassContext. Refer to `tilelang.PassConfigKey` for supported options.
- out_idx: list[int] | int | None = None
- execution_backend: Literal['auto', 'tvm_ffi', 'cython', 'nvrtc', 'torch'] = 'auto'
- target: Literal['auto', 'cuda', 'hip'] = 'auto'
- target_host: str | tvm.target.Target = None
- pass_configs: dict[str, Any] | None = None
- compile_program(program)
- Parameters:
program (tvm.tir.PrimFunc)
- __hash__()
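CompileArgs defines `__hash__()` even though one of its fields, `pass_configs`, is a dict, which Python cannot hash by default. The sketch below shows one way such a hash could be derived, by serializing the fields deterministically and digesting the result. `CompileArgsSketch` is a stand-in that only mirrors the documented field names and defaults; it is not the tilelang class, and the hashing scheme is an assumption rather than tilelang's actual implementation.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional, Union


@dataclass
class CompileArgsSketch:
    """Stand-in mirroring the documented fields; NOT tilelang's CompileArgs."""

    out_idx: Union[list, int, None] = None
    execution_backend: str = "auto"
    target: str = "auto"
    target_host: Optional[str] = None
    verbose: bool = False
    pass_configs: Optional[dict] = None

    def __hash__(self) -> int:
        # sort_keys makes the serialized form independent of dict insertion
        # order; default=str covers values that JSON cannot encode directly.
        payload = json.dumps(asdict(self), sort_keys=True, default=str)
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        # Keep 60 bits so the value fits in a machine-sized hash.
        return int(digest[:15], 16)


# Two argument sets that differ only in dict ordering hash identically,
# so either one can be used to look up the same cache entry.
a = CompileArgsSketch(pass_configs={"x": 1, "y": 2})
b = CompileArgsSketch(pass_configs={"y": 2, "x": 1})
cache = {a: "compiled-kernel-placeholder"}
```

A content-based hash of this kind lets compile arguments serve as a cache key, so that identical settings map to the same entry.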
- class tilelang.autotuner.param.ProfileArgs
Profile arguments for the auto-tuner.
- warmup
Number of warmup iterations.
- rep
Number of repetitions for timing.
- timeout
Maximum time per configuration.
- backend
Profiler backend: "event" (CUDA events), "cupti", or "cudagraph".
- supply_type
Type of tensor supply mechanism.
- ref_prog
Reference program for correctness validation.
- supply_prog
Supply program for input tensors.
- out_idx
Union[List[int], int] = -1
- supply_type
tilelang.TensorSupplyType = tilelang.TensorSupplyType.Auto
- ref_prog
Callable = None
- supply_prog
Callable = None
- rtol
float = 1e-2
- atol
float = 1e-2
- max_mismatched_ratio
float = 0.01
- skip_check
bool = False
- manual_check_prog
Callable = None
- cache_input_tensors
bool = True
- warmup: int = 25
- rep: int = 100
- timeout: int = 30
- backend: Literal['event', 'cupti', 'cudagraph'] = 'event'
- supply_type: tilelang.TensorSupplyType
- ref_prog: Callable = None
- supply_prog: Callable = None
- rtol: float = 0.01
- atol: float = 0.01
- max_mismatched_ratio: float = 0.01
- manual_check_prog: Callable = None
- __hash__()
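The page does not spell out how `rtol`, `atol`, and `max_mismatched_ratio` interact. A plausible reading is that an element counts as mismatched when it falls outside a combined absolute and relative tolerance, and that the check passes as long as the fraction of such elements stays at or below `max_mismatched_ratio`. The pure-Python sketch below illustrates that reading. The functions `mismatched_ratio` and `passes_check` are hypothetical, and the per-element rule `|a - e| > atol + rtol * |e|` is an assumption borrowed from common closeness tests, not a statement of tilelang's behaviour.

```python
def mismatched_ratio(actual, expected, rtol=1e-2, atol=1e-2):
    """Fraction of elements outside the combined tolerance (assumed rule)."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    if not expected:
        return 0.0
    mismatched = sum(
        1
        for a, e in zip(actual, expected)
        # Assumed per-element rule: absolute plus relative tolerance.
        if abs(a - e) > atol + rtol * abs(e)
    )
    return mismatched / len(expected)


def passes_check(actual, expected, rtol=1e-2, atol=1e-2,
                 max_mismatched_ratio=0.01):
    """True if the mismatched fraction is within the allowed ratio."""
    ratio = mismatched_ratio(actual, expected, rtol, atol)
    return ratio <= max_mismatched_ratio


expected = [1.0] * 200
one_outlier = [2.0] + [1.0] * 199       # 1 of 200 elements mismatched
three_outliers = [2.0] * 3 + [1.0] * 197  # 3 of 200 elements mismatched
```

With the documented defaults, `one_outlier` stays under the 1% limit while `three_outliers` exceeds it, which shows why a ratio-based check tolerates a few stray elements where a strict all-close test would fail.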
- class tilelang.autotuner.param.AutotuneResult
Results from the auto-tuning process.
- latency
Best achieved execution latency.
- config
Configuration that produced the best result.
- ref_latency
Reference implementation latency.
- libcode
Generated library code.
- func
Optimized function.
- kernel
Compiled kernel function.
- latency: float | None = None
- config: dict | None = None
- ref_latency: float | None = None
- libcode: str | None = None
- func: Callable | None = None
- kernel: Callable | None = None
- classmethod load_from_disk(path, compile_args)
- Parameters:
path (pathlib.Path)
compile_args (CompileArgs)
- Return type:
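Because `latency` and `ref_latency` both default to None, code that consumes a result needs to handle the case where one of them was never measured. The sketch below computes a speedup over the reference under that constraint. `AutotuneResultSketch` is a stand-in that mirrors only three of the documented fields, and `speedup` is a hypothetical helper; neither is part of tilelang.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AutotuneResultSketch:
    """Stand-in with a subset of the documented fields; NOT tilelang's class."""

    latency: Optional[float] = None
    config: Optional[dict] = None
    ref_latency: Optional[float] = None


def speedup(result: AutotuneResultSketch) -> Optional[float]:
    """Return ref_latency / latency, or None if either value is unusable."""
    if result.latency is None or result.ref_latency is None:
        return None
    if result.latency <= 0:
        return None
    return result.ref_latency / result.latency


# Hypothetical values: the config keys and the latencies are made up.
tuned = AutotuneResultSketch(
    latency=0.5,
    config={"block_M": 128},
    ref_latency=1.0,
)
no_reference = AutotuneResultSketch(latency=0.5)
```

Returning None instead of raising keeps the helper usable when no reference program was supplied for the tuning run.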