Entity · model

LaBSE

model · active · labse-afb1ea33 · 2 events · first seen May 19, 2026

Aliases: LaBSE

Co-occurring entities

Moral Foundations Theory · Centered Kernel Alignment · LLM-as-a-Judge · Polish language · M2M100 · VecAlign · NLLB · Gemini-2.5-Flash-Lite · Llama-Krikri-8B · QLoRA · AG-MG Parallel Corpus

More like this (12)

NLLB LabBench cuBLAS SpecBench LAVE FedLAB LCB SpatialBench Lambda Labs Barcelona Supercomputing Center Language Technologies MLE-bench SB Energy

Recent events (2)

4 · arXiv · cs.CL · May 22, 2026

Moral Semantics Survive Machine Translation: Cross-Lingual Evidence from Moral Foundations Corpora

This paper investigates whether LLM-based machine translation preserves moral semantic content well enough to support cross-lingual moral values classification, using Polish as a test case with ~50k annotated social media posts. A four-method validation pipeline (LaBSE embedding similarity, Centered Kernel Alignment (CKA), LLM-as-a-judge, and classifier parity) yields a mean cosine similarity of 0.86 and AUC gaps of only 0.01–0.02 across Moral Foundations categories. The results suggest that machine translation is a practical path to extending moral values NLP research to under-resourced languages, with generalization expected for related Slavic languages.

Evaluation and Benchmarking · Moral Foundations Theory · Centered Kernel Alignment · LLM-as-a-Judge · +2 more
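
Two of the checks named in the summary above, embedding cosine similarity and CKA, can be illustrated with a minimal sketch. This is not the paper's code: the function names are hypothetical, the toy vectors stand in for real LaBSE sentence embeddings, and the linear variant of CKA is an assumption, since the summary does not say which kernel was used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_pairwise_cosine(src_embs, tgt_embs):
    """Average cosine similarity over aligned (source, translation) pairs."""
    sims = [cosine(u, v) for u, v in zip(src_embs, tgt_embs)]
    return sum(sims) / len(sims)

def _center_columns(rows):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def _cross_frobenius_sq(a, b):
    """Squared Frobenius norm of A^T B, computed column by column."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            total += sum(x * y for x, y in zip(col_a, col_b)) ** 2
    return total

def linear_cka(x, y):
    """Linear CKA between two representations of the same n examples.

    x and y are lists of n row vectors; their widths may differ.
    """
    xc, yc = _center_columns(x), _center_columns(y)
    numerator = _cross_frobenius_sq(xc, yc)
    denominator = math.sqrt(
        _cross_frobenius_sq(xc, xc) * _cross_frobenius_sq(yc, yc)
    )
    return numerator / denominator

# Toy stand-ins for embeddings of original posts and their translations.
original = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
translated = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]

print(mean_pairwise_cosine(original, translated))
print(linear_cka(original, translated))
```

In this toy case the translated vectors are just scaled copies of the originals, so both measures come out at (approximately) 1. Real embeddings of a text and its translation would typically score lower, and the two measures answer different questions: cosine similarity compares each pair on its own, while CKA compares the overall geometry of the two embedding sets.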

4 · arXiv · cs.CL · May 19, 2026

Ancient Greek to Modern Greek Machine Translation: Novel Benchmark and Fine-Tuning Experiments

Researchers introduce the AG-MG Parallel Corpus, a dataset of 132,481 sentence pairs for Ancient Greek to Modern Greek machine translation, created via a pipeline combining web scraping, VecAlign with LaBSE embeddings, and Gemini 2.5 Flash-based alignment correction. The paper benchmarks NMT models (NLLB, M2M100) and a Greek LLM (Llama-Krikri-8B) under three fine-tuning strategies. Full-parameter fine-tuning of Llama-Krikri-8B achieves the best BLEU score of 13.16, while QLoRA-adapted M2M100-1.2B shows the largest relative gains (+10.3 BLEU). The benchmark is the first comprehensive MT benchmark for this low-resource language pair.

Evaluation and Benchmarking · Open Weights Progress · M2M100 · VecAlign · NLLB · +5 more
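
The embedding-based alignment step mentioned in the summary above can be sketched in a much simplified form. The code below is not VecAlign and does not reproduce its algorithm. It is a hypothetical one-to-one monotonic aligner that runs dynamic programming over a precomputed similarity matrix, with the matrix values and the `min_sim` threshold chosen purely for illustration. In a real pipeline the matrix would hold similarities between sentence embeddings (for example from LaBSE).

```python
def monotonic_align(sim, min_sim=0.5):
    """Pick order-preserving one-to-one sentence pairs with maximal total similarity.

    sim[i][j] is the similarity between source sentence i and target
    sentence j. Pairs below min_sim are never linked; sentences may be
    left unaligned.
    """
    n = len(sim)
    m = len(sim[0]) if sim else 0

    # best[i][j]: best total score using the first i source
    # and the first j target sentences.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            skip = max(best[i - 1][j], best[i][j - 1])
            s = sim[i - 1][j - 1]
            match = best[i - 1][j - 1] + s if s >= min_sim else float("-inf")
            best[i][j] = max(skip, match)

    # Walk back from the bottom-right corner to recover the chosen pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        s = sim[i - 1][j - 1]
        if s >= min_sim and best[i][j] == best[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs

# Illustrative similarity matrix: 3 source sentences x 3 target sentences.
similarity = [
    [0.9, 0.1, 0.2],
    [0.2, 0.3, 0.8],
    [0.1, 0.2, 0.3],
]
print(monotonic_align(similarity))  # [(0, 0), (1, 2)]
```

Here the third source sentence stays unaligned because none of its similarities reach the threshold. The summary does not say how such leftover or mismatched cases were handled beyond mentioning an LLM-based alignment correction step, so this sketch simply leaves them out.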