technique
GaLore
Recent events (1)
GaLore: Advancing Large Model Training on Consumer-grade Hardware
GaLore (Gradient Low-Rank Projection) is a memory-efficient training technique that shrinks optimizer-state memory by projecting each gradient matrix into a low-rank subspace and keeping the optimizer states in that subspace. The Hugging Face blog post covers the integration of GaLore into the transformers and peft ecosystems. Unlike LoRA, which restricts weight updates to low-rank adapter matrices, GaLore leaves the weights full-rank and applies the low-rank projection only to the gradients, so all parameters are still learned while the memory footprint drops. This makes training models such as LLaMA-7B feasible on a single consumer GPU.
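To make the memory claim concrete, the following is a rough back-of-the-envelope sketch, not the library's implementation. It counts the optimizer-state elements Adam keeps for a single weight matrix with and without a low-rank gradient projection. The function names, the 4096 x 4096 layer shape, and the rank of 128 are hypothetical values chosen for illustration. It assumes the projector is applied on the shorter side of the matrix and that Adam stores two moment buffers.

```python
def adam_state_elements(m: int, n: int) -> int:
    """Elements held by Adam's two moment buffers for a full m x n gradient."""
    return 2 * m * n


def galore_state_elements(m: int, n: int, rank: int) -> int:
    """Elements held when Adam runs on a rank-`rank` projection of the gradient.

    Assumption: the projector acts on the shorter side of the matrix, so it
    has min(m, n) * rank entries and each moment buffer has rank * max(m, n).
    """
    if not 0 < rank <= min(m, n):
        raise ValueError("rank must be in 1..min(m, n)")
    short, long = min(m, n), max(m, n)
    projector = short * rank      # projection matrix kept alongside the states
    moments = 2 * rank * long     # first and second Adam moments, projected
    return projector + moments


m, n, rank = 4096, 4096, 128  # hypothetical layer shape and rank
full = adam_state_elements(m, n)
low_rank = galore_state_elements(m, n, rank)
print(full, low_rank, round(full / low_rank, 1))  # 33554432 1572864 21.3
```

Under these assumptions the optimizer state for this one matrix is roughly 21 times smaller. The weights themselves stay at their full m x n size, which is why the saving applies to optimizer memory and not to the model parameters.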