tilelang.quantize.utils
Functions
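- gen_quant4(k, n, groupsize=-1)
- general_compress(lowprecision_weight, source_bits=4, storage_dtype=None)
- interleave_weight(qweight, nbits=4, target_dtype='float16'): Interleave the weight to the target data type.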
Module Contents
- tilelang.quantize.utils.gen_quant4(k, n, groupsize=-1)
- tilelang.quantize.utils.general_compress(lowprecision_weight, source_bits=4, storage_dtype=None)
- tilelang.quantize.utils.interleave_weight(qweight, nbits=4, target_dtype='float16')
Interleave the weight to the target data type.
- Parameters:
qweight (_type_) -- _description_
nbits (int, optional) -- _description_. Defaults to 4.
target_dtype (str, optional) -- _description_. Defaults to "float16".
- Returns:
_description_
- Return type:
_type_
Example
import torch
from tilelang.quantize.utils import interleave_weight

qweight = torch.randint(0, 127, (10, 10), dtype=torch.int8).cuda()
interleave_weight(qweight, 4, "float16")
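This page does not say which bit layout interleave_weight produces, so the following is only a rough sketch of what "interleaving" packed low-bit weights can look like. It is not tilelang's implementation. The helper name interleave_word_sketch, the bits_stride parameter, and the round-robin layout are all assumptions made for illustration. The sketch treats one 32-bit word as eight 4-bit fields and deals them alternately into the low and high 16-bit halves, which is one way to line fields up with 16-bit lanes such as float16.

```python
def interleave_word_sketch(word: int, nbits: int = 4, bits_stride: int = 16) -> int:
    """Hypothetical helper: permute the nbits-wide fields of one 32-bit word.

    Assumed layout (not confirmed by the tilelang docs): fields are dealt
    round-robin across groups of bits_stride bits.
    """
    mask = (1 << nbits) - 1
    num_groups = 32 // bits_stride          # e.g. two 16-bit halves
    elems_per_group = bits_stride // nbits  # e.g. four 4-bit fields per half
    out = 0
    for offset in range(num_groups * elems_per_group):
        # Extract the field currently stored at position `offset`.
        field = (word >> (nbits * offset)) & mask
        # Pick its group by offset modulo num_groups, then its slot in that group.
        shift = (offset % num_groups) * bits_stride + (offset // num_groups) * nbits
        out |= field << shift
    return out


# Fields 0..7 (lowest nibble first) end up as 0,2,4,6 in the low half
# and 1,3,5,7 in the high half.
print(hex(interleave_word_sketch(0x76543210)))  # 0x75316420
```

The permutation only moves fields around, so no weight value is changed or lost. A real kernel would apply the same idea to whole tensors rather than to a single Python integer.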