Entity · benchmark

chrF

benchmark · active · chrf-8259704e · 4 events · first seen Jun 5, 2026

Aliases: chrF, ChrF++, CHrF++

Co-occurring entities

Biomedical Machine Translation for Low-Resource Arabic-Script Languages via Cross-Lingual Transfer and LoRA Adapter Merging · LoRA · A Factorial Study of Synthetic Data Generation for Low-Resource Machine Translation using Grammar Books · TIES-Merging · DeltaMerge-LowRes · Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

More like this (12)

BERT-F1 · BCR · RxR · C-RASP · CFR · FACTR 2 · CoRP · DeltaCR · hh-rlhf · CRISPR · fastRAG · GCG

Recent events (4)

4 · arXiv · cs.CL · 4d ago

Cross-lingual transfer and LoRA adapter merging for biomedical NMT in low-resource Arabic-script languages

Researchers present a systematic study of healthcare-domain neural machine translation for four severely low-resource Arabic-script languages (Dari, Pashto, Sorani Kurdish, Urdu) using Arabic and Persian as higher-resource pivots. Three transfer strategies are evaluated: few-shot in-context learning, minimal supervised adaptation, and zero-data LoRA adapter merging, the last of which is novel in this setting. Supervised adaptation with just 500 sentences achieves near-pivot quality for Dari, while LoRA adapter merging comes within 3.5 chrF++ points of supervised adaptation at zero additional data cost. Pashto and Sorani Kurdish remain below clinical deployment thresholds, exposing the limits of cross-lingual transfer when structural distance is large.

Biomedical Machine Translation for Low-Resource Arabic-Script Languages via Cross-Lingual Transfer and LoRA Adapter Merging · LoRA · chrF

4 · arXiv · cs.CL · 4d ago

LLM pipeline extracts grammar books to generate synthetic MT data for endangered languages

Researchers introduce a pipeline that uses LLMs to extract grammatical rules, example sentences, and lexicons from descriptive grammar books, then generates synthetic parallel corpora for fine-tuning machine translation models. In experiments on three typologically diverse low-resource languages (Kalamang, Tuatschin, Mandan), fine-tuning on synthetic data outperforms seed-data baselines in 59–75% of configurations, with best-case chrF++ gains of up to +8.8. A systematic factorial study across 96 configurations identifies which combinations of target part-of-speech, retrieval granularity, and sample volume drive improvements.

A Factorial Study of Synthetic Data Generation for Low-Resource Machine Translation using Grammar Books · chrF

5 · arXiv · cs.CL · Jul 16, 2026

DeltaMerge-LowRes: Composing Language and Task Weight Deltas for Low-Resource NLP Adaptation

A new arXiv preprint introduces DeltaMerge-LowRes, a method for adapting multilingual encoders to new languages and tasks in low-resource settings by training language and task deltas separately and composing them in weight space at inference. The paper proposes four composition rules, including a novel cross-axis TIES variant that adapts the TIES-Merging algorithm to language/task axes rather than task/task axes. Evaluated across four task families and four African languages (158 cells), cross-axis TIES improves summarization by 4 to 7 chrF points and QA F1 by 2.32, while sparsity-aware merging reduces calibration error by 36%. The work advances model merging techniques for low-resource multilingual NLP.

Evaluation and Benchmarking · Open Weights Progress · TIES-Merging · chrF · DeltaMerge-LowRes
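
For readers unfamiliar with composing deltas in weight space, the following is a minimal sketch of the standard TIES-Merging steps (trim, elect sign, disjoint mean) applied to one language delta and one task delta. It is not the paper's cross-axis variant, whose details the summary does not give. The function names, the flat-list weight representation, and the `keep_fraction` and `scale` parameters are assumptions made for illustration.

```python
def trim(delta, keep_fraction):
    """Keep the largest-magnitude entries of a delta and zero the rest.

    Entries tied with the cut-off magnitude are all kept.
    """
    if not delta:
        return []
    k = max(1, round(len(delta) * keep_fraction))
    threshold = sorted((abs(v) for v in delta), reverse=True)[k - 1]
    return [v if abs(v) >= threshold else 0.0 for v in delta]


def ties_merge(deltas, keep_fraction=0.2, scale=1.0):
    """Merge several weight deltas with the standard TIES steps.

    `deltas` is a list of equal-length lists of floats (hypothetical
    flattened weight differences from a shared base model).
    """
    trimmed = [trim(d, keep_fraction) for d in deltas]
    merged = []
    for values in zip(*trimmed):
        # Elect the sign with the larger total mass at this position.
        sign = 1.0 if sum(values) >= 0 else -1.0
        # Average only the entries that agree with the elected sign.
        agreeing = [v for v in values if v * sign > 0]
        merged.append(scale * sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


def apply_delta(base, delta):
    """Add a merged delta back onto the base weights."""
    return [b + d for b, d in zip(base, delta)]


# Illustrative values only; they are not taken from the paper.
language_delta = [0.5, -0.1, 0.0, 0.3]
task_delta = [0.4, 0.2, -0.6, -0.05]
merged_delta = ties_merge([language_delta, task_delta], keep_fraction=0.5)
adapted = apply_delta([1.0, 1.0, 1.0, 1.0], merged_delta)
```

Trimming before sign election keeps small, noisy entries in one delta from cancelling large entries in the other, which is the usual motivation given for TIES over plain averaging.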

5 · arXiv · cs.CL · Jun 5, 2026

Reinforcement learning enables a meta-skill for translating unseen low-resource languages via in-context linguistic knowledge

Researchers propose an RL-based training approach for translating extremely low-resource or unseen languages by rewarding models for extracting and applying in-context linguistic knowledge (e.g., grammar books) rather than memorizing specific languages. Using chrF as a surface-level reward signal, RL-trained models outperform both in-context learning and supervised fine-tuning on completely unseen languages at test time. The work extends outcome-based RL beyond math and coding reasoning tasks, suggesting broader applicability to language learning from context.

Evaluation and Benchmarking · Alignment and RLHF · Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation · chrF
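
To make the "surface-level reward" idea concrete, here is a simplified sketch of a sentence-level chrF score: character n-gram precision and recall averaged over orders 1 to 6, combined into an F-score with beta = 2. It is an illustrative approximation and may not match reference implementations such as sacreBLEU digit for digit. It omits the word n-grams that chrF++ adds, and `chrf_reward` is a hypothetical wrapper, not the reward function used in the paper.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of order n after removing whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF on a 0-100 scale."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # this order is longer than one of the strings
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # beta = 2 weights recall more heavily than precision.
    return 100.0 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


def chrf_reward(hypothesis: str, reference: str) -> float:
    """Hypothetical RL reward: chrF rescaled to the range [0, 1]."""
    return simple_chrf(hypothesis, reference) / 100.0
```

Because the score depends only on character overlap with a reference, it can be computed cheaply for every sampled translation, which is what makes it usable as an outcome reward. The same property is why it is called surface-level: it does not check meaning.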