NF4

technique · active · nf4-fb82b186 · 1 event · first seen 28d ago

Aliases: NF4

Recent events (1)

Hugging Face Blog · 28d ago

Overview of Natively Supported Quantization Schemes in 🤗 Transformers

This Hugging Face blog post surveys the quantization methods natively integrated into the Transformers library as of September 2023, covering schemes such as GPTQ, bitsandbytes (LLM.int8, NF4), and related techniques. It explains how each method works, its trade-offs in memory reduction and inference speed, and how practitioners can apply it via the Transformers API. The post serves as a practical reference for deploying large language models under memory constraints.
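
To make the idea behind NF4 concrete, here is a minimal, standard-library-only sketch of 4-bit block quantization against a codebook built from normal-distribution quantiles. It is an illustration under stated assumptions, not the bitsandbytes implementation: the function names (`normal_codebook`, `quantize_block`, `dequantize_block`) are hypothetical, and the codebook uses evenly spaced quantiles rather than the exact 16-value table the library ships, which also includes an exact zero.

```python
from statistics import NormalDist


def normal_codebook(levels=16):
    """Hypothetical codebook: normal quantiles rescaled to [-1, 1].

    Assumption: evenly spaced midpoint quantiles. This is not the
    exact NF4 table used by bitsandbytes.
    """
    nd = NormalDist()
    # Midpoint probabilities avoid the infinite quantiles at 0 and 1.
    qs = [nd.inv_cdf((i + 0.5) / levels) for i in range(levels)]
    top = max(abs(q) for q in qs)
    return [q / top for q in qs]


def quantize_block(weights, codebook):
    """Map each weight to the index of the nearest codebook level.

    The block is scaled by its absolute maximum, so only the 4-bit
    indices and one float scale need to be stored per block.
    """
    scale = max(abs(w) for w in weights) or 1.0  # guard all-zero blocks
    codes = [
        min(range(len(codebook)), key=lambda i: abs(codebook[i] - w / scale))
        for w in weights
    ]
    return codes, scale


def dequantize_block(codes, scale, codebook):
    """Recover approximate weights from indices and the block scale."""
    return [codebook[c] * scale for c in codes]


if __name__ == "__main__":
    cb = normal_codebook()
    block = [0.5, -0.25, 0.1, -1.2]
    codes, scale = quantize_block(block, cb)
    print(codes, scale)
    print(dequantize_block(codes, scale, cb))
```

Because the levels sit at normal quantiles, they are densest near zero, where normally distributed weights concentrate; the largest rounding error occurs at the tails of each block.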