tilelang.quantize.utils

Functions

`gen_quant4`(k, n[, groupsize])
`general_compress`(lowprecision_weight[, source_bits, ...])
`interleave_weight`(qweight[, nbits, target_dtype]): Interleave the weight to the target data type.

tilelang.quantize.utils.general_compress(lowprecision_weight, source_bits=4, storage_dtype=None)
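This page gives no description for `general_compress`, so the following is only a minimal sketch of what compressing `source_bits`-wide values into byte storage usually means. It is pure Python, the helper name `pack_low_bits` is hypothetical, and the low-bits-first packing order is an assumption, not something confirmed by the library.

```python
def pack_low_bits(values, source_bits=4):
    """Pack small integers into bytes, lowest field first (illustrative only).

    Hypothetical helper: this is NOT tilelang's general_compress, only a
    sketch of the general idea of bit-packing low-precision values.
    """
    elems_per_byte = 8 // source_bits      # e.g. two 4-bit values per byte
    mask = (1 << source_bits) - 1          # keeps only the low source_bits bits
    packed = []
    for start in range(0, len(values), elems_per_byte):
        byte = 0
        for k, value in enumerate(values[start:start + elems_per_byte]):
            # Assumed order: the k-th value lands in the k-th field of the byte.
            byte |= (value & mask) << (source_bits * k)
        packed.append(byte)
    return packed


# Four 4-bit values become two bytes: 0x21 and 0x43.
print([hex(b) for b in pack_low_bits([1, 2, 3, 4])])
```

Under this assumed layout, the packed output has `8 // source_bits` times fewer elements than the input along the packed dimension.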

tilelang.quantize.utils.interleave_weight(qweight, nbits=4, target_dtype='float16')

Interleave the weight to the target data type.

Parameters:

Returns:

_description_

Return type:

_type_

Example

import torch
from tilelang.quantize.utils import interleave_weight

qweight = torch.randint(0, 127, (10, 10), dtype=torch.int8).cuda()
interleave_weight(qweight, 4, "float16")
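To make "interleave" concrete, here is a minimal sketch of one possible interleaving of `nbits`-wide fields inside a single 32-bit word. It is pure Python, the helper name `interleave_word` and its `bits_stride` parameter are hypothetical, and the layout is an assumption chosen for illustration; the layout that `interleave_weight` actually produces should be checked against the library source.

```python
def interleave_word(word, nbits=4, bits_stride=16):
    """Permute the nbits-wide fields of one 32-bit word (illustrative only).

    Hypothetical helper: bits_stride is the assumed width of each
    destination lane (16 for a float16 target, 8 for an int8 target).
    """
    mask = (1 << nbits) - 1
    num_groups = 32 // bits_stride           # lanes per 32-bit word
    elems_per_group = bits_stride // nbits   # fields that fit in one lane
    out = 0
    for offset in range(num_groups * elems_per_group):
        field = (word >> (nbits * offset)) & mask
        # Consecutive source fields are spread round-robin across the lanes.
        shift = (offset % num_groups) * bits_stride + (offset // num_groups) * nbits
        out |= field << shift
    return out


# Fields 0..7 (low nibble first): even fields fill the low 16-bit lane,
# odd fields fill the high lane.
print(hex(interleave_word(0x76543210)))  # 0x75316420
```

The design idea behind such a layout is that each lane ends up holding every `num_groups`-th field, so a kernel can extract one field per lane with a single mask and shift.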