tilelang.quantize.utils

Functions

gen_quant4(k, n[, groupsize])

general_compress(lowprecision_weight[, source_bits, ...])

interleave_weight(qweight[, nbits, target_dtype])

Interleave the weight to the target data type.

Module Contents

tilelang.quantize.utils.gen_quant4(k, n, groupsize=-1)
tilelang.quantize.utils.general_compress(lowprecision_weight, source_bits=4, storage_dtype=None)
tilelang.quantize.utils.interleave_weight(qweight, nbits=4, target_dtype='float16')

Interleave the weight to the target data type.

Parameters:
  • qweight (_type_) -- _description_

  • nbits (int, optional) -- _description_. Defaults to 4.

  • target_dtype (str, optional) -- _description_. Defaults to "float16".

Returns:

_description_

Return type:

_type_

Example

import torch
from tilelang.quantize.utils import interleave_weight

qweight = torch.randint(0, 127, (10, 10), dtype=torch.int8).cuda()
interleave_weight(qweight, 4, "float16")
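
This page does not describe the bit layout that interleaving produces, so the following is only a minimal pure-Python sketch of what such a rearrangement could look like for a single 32-bit word. The helper name `interleave_word`, the `lane_bits` parameter, and the layout itself (fields distributed round-robin across 16-bit lanes for a "float16" target, or 8-bit lanes for an "int8" target) are assumptions made for illustration; they are not confirmed by this documentation and are not part of the tilelang API.

```python
def interleave_word(word: int, nbits: int = 4, lane_bits: int = 16) -> int:
    """Hypothetical helper: rearrange the nbits-wide fields of one 32-bit word.

    Assumed layout: field number idx goes to lane (idx % num_lanes),
    at slot (idx // num_lanes) inside that lane.
    """
    mask = (1 << nbits) - 1
    num_lanes = 32 // lane_bits
    out = 0
    for idx in range(32 // nbits):
        # Extract field idx from the packed source word.
        field = (word >> (nbits * idx)) & mask
        lane = idx % num_lanes    # which lane receives this field
        slot = idx // num_lanes   # position of the field inside the lane
        out |= field << (lane * lane_bits + slot * nbits)
    return out


# Eight 4-bit fields holding the values 0..7, field 0 in the lowest bits.
packed = 0x76543210
print(hex(interleave_word(packed, 4, 16)))  # 0x75316420
print(hex(interleave_word(packed, 4, 8)))   # 0x73625140
```

With 16-bit lanes, the even-numbered fields (0, 2, 4, 6) land in the low half of the word and the odd-numbered fields (1, 3, 5, 7) in the high half. A layout of this kind lets each lane be unpacked with the same mask-and-shift sequence.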