Macro-F1
macro-f1-1552ca9c·3 events·first seen 25d agoAliases: Macro-F1, Macro F1
Co-occurring entities
More like this (12)
Recent events (3)
Image-Semantic Guided Detection of AI-Generated Modern Chinese Poetry Using MLLMs
This paper proposes a multimodal detection method for identifying AI-generated modern Chinese poetry by incorporating images that reflect poetic content alongside text. The approach uses example-driven prompting to integrate meaning, imagery, and emotional cues from images as a complement to textual analysis. A Gemini-based detector using this method achieves 85.65% Macro-F1, outperforming both plain-text LLM baselines and the traditional RoBERTa detector. The work extends AI-generated content detection research into a domain—modern Chinese poetry—previously unaddressed by prior studies.
Forgotten Words: Benchmarking NeoBERT for Dementia Detection in Low-Resource Conversational Filipino and English Speech
This paper presents the first NLP-based dementia detection study for Filipino speech, constructing a parallel bilingual dataset of 4,000 DementiaBank-derived transcripts with manual Filipino translations. Five model families are evaluated across monolingual, zero-shot cross-lingual, and bilingual fine-tuning settings. English-trained BERT degrades sharply on Filipino (Macro-F1 = 0.455), but bilingual fine-tuning recovers performance to Macro-F1 = 0.969–0.973 across all transformer models. The key finding is that multilingual clinical NLP performance is driven by linguistic coverage during training rather than model scale or architecture.
Sentence-Level Clinical Provenance Categorization for Multidisciplinary Hospital Summarization Using Fine-Tuned Llama-3
This pilot study presents a pipeline for categorizing sentence-level clinical provenance across multi-source hospital notes, targeting structured summarization in high-complexity settings like the NICU. The authors fine-tune Llama-3 8B and 70B models on MedSecId (MIMIC-III annotations), achieving Macro F1 above 92% in-domain. Cross-domain evaluation reveals a scale-dependent transfer effect: SFT substantially improves the 70B model (+7% Macro F1) but yields only marginal gains for the 8B model. A quantized fine-tuned 70B model outperforms its full-precision baseline while reducing compute, suggesting quantized adaptation is viable for structured clinical NLP tasks.