Almanac
product

Q8-Chat

product·active·q8-chat-cd1e292a·1 event·first seen 28d ago

Aliases: Q8-Chat

Recent events (1)

5·Hugging Face Blog·28d ago

Q8-Chat: Efficient Generative AI on Intel Xeon via INT8 Quantization

Hugging Face and Intel demonstrate INT8-quantized (Q8) large language models running on Intel Xeon CPUs, under the name Q8-Chat. The post covers the inference performance of these quantized models on CPU hardware, with no GPU required. It bears on inference economics and enterprise deployment, particularly for organizations without GPU infrastructure.