logbquant-066a0ee7 · 1 event · first seen · Aliases: LogbQuant
A new arXiv preprint introduces LogbQuant, a logarithmic quantization scheme with tunable bases, designed to fit the weight distributions commonly seen in language models. The method targets a known weakness of uniform quantization: its poor handling of low-frequency, high-magnitude weights. At 4-bit precision and tensor-wise granularity, the authors report that LogbQuant outperforms asymmetric linear quantization, with a moderate speedup and large memory savings that suit deployment on consumer-grade GPUs.
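To make the contrast concrete, here is a minimal sketch of the two families of quantizer mentioned above, both at tensor-wise granularity. It is not the preprint's formulation: the function names (`log_quantize`, `linear_quantize`, and their dequantizers), the sign-plus-exponent code layout, and the choice of the maximum absolute weight as the scale are all assumptions made for illustration.

```python
import math

def log_quantize(weights, base=2.0, bits=4):
    """Illustrative log quantizer: w ~ sign * scale * base**(-k).

    Assumed layout (not from the preprint): 1 sign bit plus
    (bits - 1) bits for the exponent code k.
    """
    levels = 2 ** (bits - 1)                 # magnitude levels per sign
    scale = max(abs(w) for w in weights)     # one scale for the whole tensor
    codes = []
    for w in weights:
        if w == 0.0 or scale == 0.0:
            k = levels - 1                   # zero maps to the smallest magnitude
        else:
            k = round(-math.log(abs(w) / scale, base))
            k = min(max(k, 0), levels - 1)   # clamp to the representable range
        codes.append((w < 0, k))
    return codes, scale

def log_dequantize(codes, scale, base=2.0):
    return [(-1.0 if neg else 1.0) * scale * base ** (-k) for neg, k in codes]

def linear_quantize(weights, bits=4):
    """Asymmetric linear quantizer: evenly spaced levels between min and max."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (2 ** bits - 1) or 1.0   # avoid a zero step for constant tensors
    codes = [round((w - lo) / step) for w in weights]
    return codes, lo, step

def linear_dequantize(codes, lo, step):
    return [lo + q * step for q in codes]

if __name__ == "__main__":
    # A toy tensor: many small weights plus one rare, large one.
    w = [0.01, -0.02, 0.015, 0.03, -0.012, 1.0]
    log_codes, scale = log_quantize(w, base=2.0)
    lin_codes, lo, step = linear_quantize(w)
    print("log   :", log_dequantize(log_codes, scale, base=2.0))
    print("linear:", linear_dequantize(lin_codes, lo, step))
```

In this sketch the base controls the trade-off between dynamic range and resolution: with 8 magnitude levels, the smallest representable magnitude is `scale * base**(-7)`, so a base closer to 1 spaces the levels more finely near the maximum but covers a narrower range, while a larger base reaches smaller magnitudes at coarser spacing. The linear quantizer instead spends its 16 levels evenly across the full min-to-max interval, regardless of where the weights actually cluster.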