
3LM

benchmark · active · 3lm-0fdbc222 · 1 event · first seen 28d ago

Aliases: 3LM

Recent events (1)

Hugging Face Blog · 28d ago

3LM: A Benchmark for Arabic LLMs in STEM and Code

TII UAE has released 3LM, a benchmark designed to evaluate large language models on Arabic-language STEM and coding tasks. The benchmark addresses a gap in multilingual evaluation infrastructure, where Arabic has been underrepresented relative to English and other high-resource languages. It targets both general-purpose and Arabic-specialized LLMs to assess their capabilities in technical domains.