QLoRA

technique · active · qlora-4a9e3636 · 6 events · first seen May 19, 2026

Aliases: QLoRA

More like this (12)

LoRA · κ-LoRA · RLOO · Late-Stage LoRA · MaLoRA · MoE²-LoRA · RL² · Multi-LoRA serving · Doc-to-LoRA · Localized LoRA-MoE · TailLoR · OLMo

Recent events (6)

4 · arXiv · cs.CL · Jul 13, 2026

Real-time sentence-level sign language translation system using SHuBERT-ByT5 with streaming hardware architecture

Researchers present a sentence-level sign language translation (SLT) system fine-tuned on How2Sign using QLoRA on a SHuBERT-ByT5 stack, achieving BLEU 15.9 and BLEURT 44.7 on the test split. The primary contribution is a hardware-aware streaming architecture that distributes camera capture to a Raspberry Pi 4B client while offloading perception and translation to a CPU/GPU backend. Latency optimizations including chunked ingestion, parallelized perception, and a sentence-boundary state machine reduce mean post-finalization response latency by 27.7% to 1.354 seconds. The system is designed for real-time deployment across diverse client devices rather than proposing a novel translation architecture.
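
As a quick check on the latency figure above, the implied pre-optimization mean can be recovered from the two reported numbers. This is a minimal sketch that assumes the 27.7% is a relative reduction of the same mean metric; the variable names are illustrative and not taken from the paper.

```python
# Reported in the summary: mean post-finalization response latency after
# optimization, and the relative reduction that produced it.
optimized_s = 1.354
relative_reduction = 0.277

# Assumption: the reduction is relative to the unoptimized mean of the same metric.
baseline_s = optimized_s / (1.0 - relative_reduction)
saved_s = baseline_s - optimized_s

print(f"implied baseline: {baseline_s:.3f} s, saved per response: {saved_s:.3f} s")
```

Under that assumption the unoptimized mean works out to roughly 1.87 seconds, so the optimizations save about half a second per finalized sentence.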

Inference Economics · Multimodal Progress · BLEU · How2Sign · QLoRA

3 · arXiv · cs.CL · Jul 1, 2026

Cross-lingual relation extraction for Romanian: QLoRA fine-tuning narrows gap but small encoders remain competitive

Researchers evaluate cross-lingual relation extraction for Romanian by translating the SemEval-2010 Task 8 benchmark and testing Gemma 4 31B under zero-shot, few-shot, and QLoRA fine-tuned settings against encoder baselines (XLM-RoBERTa, Romanian BERT, RoBERTa-large). QLoRA fine-tuning improves macro F1 by over 22 percentage points and reduces the English-Romanian cross-lingual gap from 3.3 to 1.4 pp, but encoder baselines of 125M–560M parameters come within 1–4 pp of the fine-tuned 31B model. The study concludes that large LLMs offer limited advantage over compact encoders for single-task relation extraction in compute-constrained deployment scenarios, and releases the translated dataset, code, and models.
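
The comparison above is stated in macro F1, which averages per-class F1 without weighting by class frequency. The sketch below is a generic macro-F1 computation on toy labels, not the benchmark's official scorer, which differs in details such as how the Other class and relation direction are handled. The function name and the example data are assumptions for illustration.

```python
from collections import Counter

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over the labels that occur in gold."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label gets a false positive
            fn[g] += 1  # gold label gets a false negative
    scores = []
    for label in sorted(set(gold)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        scores.append(2 * precision * recall / pr_sum if pr_sum else 0.0)
    return sum(scores) / len(scores)

# Toy labels for illustration only; not data from the paper.
gold = ["Cause-Effect", "Cause-Effect", "Component-Whole", "Other"]
pred = ["Cause-Effect", "Component-Whole", "Component-Whole", "Other"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.4f}")
```

Because every class counts equally, a gain of 22 percentage points in macro F1 reflects improvement across relation types rather than only on the most frequent ones.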

Evaluation and Benchmarking · Open Weights Progress · Romanian BERT · QLoRA · XLM-RoBERTa

4 · arXiv · cs.CL · Jun 9, 2026

Corpus-Grounded Feature Diffusion pipeline for automated IEP generation in Traditional Chinese

Researchers propose a low-resource fine-tuning pipeline called Corpus-Grounded Feature Diffusion (CGFD) to automate Individualized Education Program (IEP) drafting from Traditional Chinese parent-teacher interview transcripts. The approach fine-tunes Breeze-7B with QLoRA on 582 synthetically diffused samples and uses schema-constrained decoding at inference time, finding that Grammar-Constrained Decoding is counterproductive under Traditional Chinese token budgets. On a small formal hold-out (n=10), the system achieves BERTScore F1 of 0.779, outperforming zero-shot GPT-5.4, DeepSeek-V3.2, Gemini-3-Flash-Preview, and Llama-4-Maverick baselines while enabling fully local, air-gapped inference. The work addresses a gap in Traditional Chinese special-education NLP and demonstrates a privacy-preserving deployment pattern for sensitive document generation.

Evaluation and Benchmarking · Enterprise Deployment Patterns · DeepSeek V4 · Corpus-Grounded Feature Diffusion · Grammar-Constrained Decoding

6 · Hugging Face Blog · May 19, 2026

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

Hugging Face published a blog post detailing the integration of 4-bit quantization via bitsandbytes into the Transformers library, enabling large language models to run on consumer-grade hardware. The post covers NF4 (NormalFloat4) data type and double quantization techniques from the QLoRA paper, which together reduce memory footprint significantly while preserving model quality. It demonstrates how users can load models like LLaMA in 4-bit precision and fine-tune them using QLoRA with minimal code changes.
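
To make the NF4 idea above concrete, the following is a minimal pure-Python sketch of blockwise 4-bit quantization: each block of weights is scaled by its absolute maximum and every value is snapped to the nearest of 16 fixed levels. The level values are rounded approximations of the NF4 codebook and should be treated as illustrative. The function names are hypothetical, and double quantization (quantizing the per-block scales themselves) is not shown.

```python
# Approximate NF4 code values, rounded to 4 decimals (illustrative assumption).
NF4_LEVELS = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]

def quantize_block(weights):
    """Return (absmax scale, list of 4-bit indices) for one block of weights."""
    scale = max(abs(w) for w in weights) or 1.0  # avoid dividing by zero
    indices = [
        min(range(16), key=lambda i: abs(NF4_LEVELS[i] - w / scale))
        for w in weights
    ]
    return scale, indices

def dequantize_block(scale, indices):
    """Map 4-bit indices back to approximate float weights."""
    return [NF4_LEVELS[i] * scale for i in indices]

# Hypothetical block of weights for demonstration.
block = [0.12, -0.40, 0.03, 0.25, -0.07, 0.40, -0.21, 0.0]
scale, idx = quantize_block(block)
restored = dequantize_block(scale, idx)
max_err = max(abs(a - b) for a, b in zip(block, restored))
print(idx, f"max abs error = {max_err:.4f}")
```

In Transformers itself this is configured rather than hand-written, through `BitsAndBytesConfig` (for example `load_in_4bit=True`, `bnb_4bit_quant_type="nf4"`, `bnb_4bit_use_double_quant=True`) passed as `quantization_config` when loading a model.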

Open Weights Progress · Inference Economics · Transformers · NF4 (NormalFloat4) · QLoRA

5 · Hugging Face Blog · May 19, 2026

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

This Hugging Face blog post covers techniques for fine-tuning the FLUX.1-dev image generation model using LoRA (Low-Rank Adaptation) on consumer-grade hardware. The post likely addresses quantization strategies (QLoRA) to reduce memory requirements, enabling training on GPUs with limited VRAM. This is relevant to the open-weights and accessible fine-tuning ecosystem for diffusion models.
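
The memory argument for LoRA comes down to simple arithmetic: instead of updating a full weight matrix, training touches two thin matrices whose product has the same shape. The sketch below counts trainable parameters for one layer. The layer shape and rank are hypothetical values chosen for illustration, not figures from the FLUX.1-dev post.

```python
def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters in a full update vs. a rank-`rank` LoRA update B @ A."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)  # B is d_out x rank, A is rank x d_in
    return full, lora

# Hypothetical layer shape and rank, for illustration only.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x")
```

QLoRA combines this with a 4-bit frozen base model, so both the stored weights and the trainable state shrink.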

Open Weights Progress · Inference Economics · Black Forest Labs · FLUX.1-dev · LoRA

4 · arXiv · cs.CL · May 19, 2026

Ancient Greek to Modern Greek Machine Translation: Novel Benchmark and Fine-Tuning Experiments

Researchers introduce the AG-MG Parallel Corpus, a 132,481 sentence-pair dataset for Ancient Greek to Modern Greek machine translation, created via a pipeline combining web scraping, VecAlign with LaBSE embeddings, and Gemini 2.5 Flash-based alignment correction. The paper benchmarks NMT models (NLLB, M2M100) and a Greek LLM (Llama-Krikri-8B) under three fine-tuning strategies. Full-parameter fine-tuning of Llama-Krikri-8B achieves the best BLEU score of 13.16, while QLoRA-adapted M2M100-1.2B shows the largest relative gains (+10.3 BLEU). This represents the first comprehensive MT benchmark for this low-resource language pair.

Evaluation and Benchmarking · Open Weights Progress · M2M100 · VecAlign · NLLB