4 · Hugging Face Blog · 1mo ago

Accelerating Stable Diffusion Inference on Intel CPUs

This Hugging Face blog post details techniques for optimizing Stable Diffusion inference on Intel CPUs, likely covering quantization, operator fusion, and Intel-specific hardware acceleration libraries. The post addresses the practical challenge of running diffusion models on CPU hardware without dedicated GPUs. This is relevant to inference economics and enterprise deployment patterns where GPU availability is constrained.

Related events (8)

4 · Hugging Face Blog · 1mo ago

Optimizing Stable Diffusion for Intel CPUs with NNCF and Hugging Face Optimum

This Hugging Face blog post details techniques for optimizing Stable Diffusion inference on Intel CPUs using Neural Network Compression Framework (NNCF) and the Optimum library. The workflow covers quantization and other compression methods to reduce latency and memory footprint on CPU hardware. This is relevant to the inference-economics and enterprise-deployment threads as it addresses running diffusion models without dedicated GPU hardware.

4 · Hugging Face Blog · 1mo ago

Fine-tuning Stable Diffusion models on Intel CPUs

This Hugging Face blog post describes a workflow for fine-tuning Stable Diffusion image generation models on Intel CPUs rather than GPUs. It covers the tooling and optimizations required to make CPU-based diffusion model training practical, which bears on compute economics and hardware diversification trends. The post targets practitioners looking to reduce dependency on GPU hardware for generative model fine-tuning.

4 · Hugging Face Blog · 1mo ago

Accelerating Stable Diffusion XL Inference with JAX on Cloud TPU v5e

Hugging Face published a technical blog post detailing how to accelerate Stable Diffusion XL inference using JAX on Google Cloud TPU v5e hardware. The post covers the integration of JAX-based diffusion pipelines with TPU v5e, demonstrating performance gains from hardware-software co-optimization. This represents a practical deployment pattern for large image generation models on non-GPU accelerators.

4 · Hugging Face Blog · 1mo ago

Faster Stable Diffusion with Core ML on iPhone, iPad, and Mac

Hugging Face published a blog post detailing optimizations for running Stable Diffusion models via Core ML on Apple devices, including iPhone, iPad, and Mac. The post covers techniques to accelerate on-device inference using the Apple Neural Engine and the Core ML framework. This represents progress in deploying capable diffusion models at the edge without cloud dependency.

5 · Hugging Face Blog · 1mo ago

Using Stable Diffusion with Core ML on Apple Silicon

Hugging Face published a guide on running Stable Diffusion models via Apple's Core ML framework on Apple Silicon hardware. The post covers converting diffusion model weights to Core ML format and integrating them into the Diffusers library for on-device inference. This represents an early effort to enable efficient local image generation on consumer Apple hardware without requiring cloud GPU resources.

4 · Hugging Face Blog · 1mo ago

Accelerate StarCoder with Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding

Hugging Face and Intel demonstrate quantization (INT8/INT4) and speculative decoding techniques applied to StarCoder on Intel Xeon CPUs using the Optimum Intel library. The post covers practical inference acceleration workflows targeting CPU deployment of code generation models. This represents a concrete inference-economics use case for open-weight code models on commodity server hardware.

4 · Hugging Face Blog · 1mo ago

Accelerating SD Turbo and SDXL Turbo Inference with ONNX Runtime and Olive

This Hugging Face blog post details how to accelerate Stable Diffusion Turbo and SDXL Turbo inference using ONNX Runtime and Microsoft's Olive optimization toolkit. The post covers the workflow for converting and optimizing diffusion models for faster deployment. This is a practical inference optimization guide targeting practitioners deploying image generation models.

5 · Hugging Face Blog · 1mo ago

Stable Diffusion XL on Mac with Advanced Core ML Quantization

Hugging Face details the process of running Stable Diffusion XL (SDXL) on Apple Silicon Macs using Core ML with advanced quantization techniques. The post covers how quantization reduces model size and memory requirements to make SDXL feasible on consumer Mac hardware. This represents a practical deployment advance for running large diffusion models at the edge on Apple devices.
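
As a rough illustration of why lower-precision weights shrink a model, the sketch below estimates weight storage at several bit widths. The parameter count is a hypothetical round number chosen for the example, not a figure from the post, and the function name is made up for this sketch. The estimate covers weights only and ignores activations, quantization metadata such as scales, and runtime overhead.

```python
def weight_footprint_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in GiB (2**30 bytes)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical parameter count, for illustration only (not taken from the post).
NUM_PARAMS = 2_500_000_000

if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_footprint_gib(NUM_PARAMS, bits)
        print(f"{bits:>2}-bit weights: {size:.2f} GiB")
```

Storage scales linearly with bits per weight, so halving the bit width halves the weight footprint. Real quantized models tend to be somewhat larger than this estimate because some layers are often kept at higher precision.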