quantization
quantization-c5b93eaa · 4 events · first seen 1mo ago · Aliases: quantization
Recent events (4)
Stable Diffusion XL on Mac with Advanced Core ML Quantization
Hugging Face details the process of running Stable Diffusion XL (SDXL) on Apple Silicon Macs using Core ML with advanced quantization techniques. The post covers how quantization reduces model size and memory requirements to make SDXL feasible on consumer Mac hardware. This represents a practical deployment advance for running large diffusion models at the edge on Apple devices.
Bringing Robotics AI to Embedded Platforms: Dataset Recording, VLA Fine-Tuning, and On-Device Optimizations
NXP and Hugging Face describe a pipeline for deploying Vision-Language-Action (VLA) models on embedded/edge hardware, covering dataset recording, fine-tuning, and on-device optimization techniques. The post targets robotics applications where inference must run on resource-constrained microcontrollers or SoCs rather than cloud GPUs. Key topics include quantization, model compression, and integration with the LeRobot ecosystem. This represents a practical engineering bridge between frontier VLA research and real-world embedded robotics deployment.
What Parameter Golf taught us about AI-assisted research
OpenAI's Parameter Golf competition attracted over 1,000 participants and 2,000+ submissions focused on AI-assisted ML research under strict constraints. The challenge explored coding agents, quantization techniques, and novel model design within tight parameter budgets. The event served as a structured probe into how AI tools augment human researchers tackling constrained optimization problems.
Sentence-Level Clinical Provenance Categorization for Multidisciplinary Hospital Summarization Using Fine-Tuned Llama-3
This pilot study presents a pipeline for categorizing sentence-level clinical provenance across multi-source hospital notes, targeting structured summarization in high-complexity settings like the NICU. The authors fine-tune Llama-3 8B and 70B models on MedSecId (MIMIC-III annotations), achieving Macro F1 above 92% in-domain. Cross-domain evaluation reveals a scale-dependent transfer effect: SFT substantially improves the 70B model (+7% Macro F1) but yields only marginal gains for the 8B model. A quantized fine-tuned 70B model outperforms its full-precision baseline while reducing compute, suggesting quantized adaptation is viable for structured clinical NLP tasks.
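Several of the summaries above credit quantization with reducing model size and memory requirements. As a rough illustration of why, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is a generic textbook scheme, not the method used in any of the posts listed here; the function names and the sample weights are made up for the example.

```python
import array

def quantize_int8(weights):
    """Symmetric per-tensor quantization: one shared scale maps floats to int8."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range [-127, 127].
    q = array.array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]

# Hypothetical weights, stored as 32-bit floats.
weights = array.array("f", [0.5, -1.0, 0.25, 0.75, -0.125, 0.0, 1.0, -0.6])

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

fp32_bytes = weights.itemsize * len(weights)  # 4 bytes per weight
int8_bytes = q.itemsize * len(q)              # 1 byte per weight
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"fp32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"max round-trip error: {max_error:.5f} (scale = {scale:.5f})")
```

Storage drops by a factor of four (plus one float for the scale), at the cost of a rounding error bounded by about half the scale per weight. Production toolchains typically refine this with per-channel or per-group scales, lower bit widths, or lookup-table schemes.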