benchmark
Text Analytics Evaluation Framework
benchmark · active
text-analytics-evaluation-framework-65e089f1 · 1 event · first seen 26d ago
Aliases: Text Analytics Evaluation Framework
More like this (12)
AI Cybersecurity Threat Evaluation Framework
wet lab biological research evaluation framework
T-Eval
OpenAI Evals
Text Aphasia Battery (TAB)
FLTEval
Advanced AI Scaling Framework
Artificial Analysis Text to Image Leaderboard
Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting
AI-assisted human evaluation
Data Measurements Tool
Frontier AI Framework
Recent events (1)
Text Analytics Evaluation Framework: Benchmarking LLMs on Social Media NLP Tasks
Researchers introduce a 470-question evaluation framework for assessing LLM performance on aggregated social media text, and apply it to Twitter datasets covering sentiment analysis, hate speech detection, and emotion recognition. Results show that performance degrades substantially once the input exceeds 500 instances, particularly for open-weights models on numerical tasks. Multi-label and target-dependent scenarios also show notable performance drops, and accuracy erodes progressively as task complexity rises from basic semantic identification to comparison and counting operations. The findings point to architectural bottlenecks that limit current LLMs in rigorous quantitative analysis over large text collections.