qualcomm-ai-research-2128489d · 1 event · first seen · Aliases: Qualcomm AI Research
Qualcomm AI Research introduces BamiBERT, a BERT-based encoder pre-trained from scratch on 129 GB of Vietnamese text for 20 epochs. It supports contexts of up to 2048 tokens and does not require external word segmentation. It outperforms PhoBERT, the previous de facto standard Vietnamese encoder, achieving the best scores on 11 of 15 metrics across 8 Vietnamese benchmarks. The model is publicly released on Hugging Face.