Entity · technique

ChunkFT

technique · active · chunkft-28d0ae31 · 1 event · first seen May 21, 2026

Aliases: ChunkFT

Co-occurring entities

Llama 3.1 70B · MT-Bench · Meta AI · RTX 4090 · H800 · Llama-3.1-8B

More like this (12)

MemFT · SFT · Action Chunking Transformer · UltraFedFM · FastRTC · DFlash · Heaptrack · PEFT · Target-SFT · fastText · FTX · ByteDance

Recent events (1)

6 · arXiv · cs.CL · May 21, 2026

ChunkFT: Memory-Efficient Full Fine-Tuning via Byte-Streamed Chunk Optimization

ChunkFT is a fine-tuning framework that reformulates full-parameter optimization around a dynamically activated working set of sub-tensors, so that gradients are computed without materializing a dense, full-size gradient. It fine-tunes all parameters of a 7B model in 13.72 GB of GPU memory on a single RTX 4090, and fits Llama 3-70B fine-tuning on 2×H800 GPUs. Downstream evaluations on language understanding, math reasoning, and MT-Bench show that ChunkFT matches or exceeds the quality of standard full-parameter fine-tuning while outperforming existing memory-efficient baselines such as LoRA-class methods. A theoretical convergence analysis in the deterministic setting is also provided.
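The summary above does not spell out ChunkFT's algorithm, so the following is only a minimal sketch of the general idea it describes: updating every parameter while holding the gradient of just one active chunk at a time. It uses plain block-wise gradient descent on a toy quadratic loss as a stand-in. The names `chunked_gradient_descent`, `quadratic_grad`, and `chunk_size` are hypothetical and are not taken from the paper.

```python
def chunked_gradient_descent(weights, grad_fn, chunk_size, lr, sweeps):
    """Update `weights` in place, one chunk at a time.

    `grad_fn(weights, start, stop)` returns the partial gradient for
    weights[start:stop] only, so a full-length gradient list is never built.
    Returns the size of the largest gradient buffer that was allocated.
    Assumption: this is generic block-wise descent, not ChunkFT itself.
    """
    peak_grad_len = 0
    for _ in range(sweeps):
        for start in range(0, len(weights), chunk_size):
            stop = min(start + chunk_size, len(weights))
            grad = grad_fn(weights, start, stop)  # gradient of the active chunk only
            peak_grad_len = max(peak_grad_len, len(grad))
            for offset, g in enumerate(grad):
                weights[start + offset] -= lr * g
            # `grad` goes out of scope here, before the next chunk is activated
    return peak_grad_len


def quadratic_grad(target):
    """Gradient of 0.5 * sum((w - t)^2), restricted to a slice of indices."""
    def grad_fn(weights, start, stop):
        return [weights[i] - target[i] for i in range(start, stop)]
    return grad_fn


# Toy problem: 7 parameters, at most 3 gradients held at any moment.
target = [1.0, -2.0, 3.0, 0.5, 4.0, -1.5, 2.5]
weights = [0.0] * len(target)
peak = chunked_gradient_descent(
    weights, quadratic_grad(target), chunk_size=3, lr=0.5, sweeps=40
)
max_error = max(abs(w - t) for w, t in zip(weights, target))
print(f"peak gradient buffer: {peak} of {len(weights)} parameters")
print(f"max error after training: {max_error:.2e}")
```

In this toy setting every parameter is still trained, but the gradient buffer never exceeds `chunk_size` entries. Whether a real model behaves this way depends on how the backward pass is organized, which the summary does not describe.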

Training Infrastructure Open Weights Progress Llama 3.1 70B MT-Bench Meta AI