Self-Ensembling VLM Chart Extraction

technique · active · self-ensembling-vlm-chart-extraction-0e49e8bb · 1 event · first seen May 27, 2026

Aliases: Self-Ensembling VLM Chart Extraction

More like this (12)

FakeVLM
The Count Is There, but Misaligned: Understanding and Correcting Counting Failures in VLMs
Neuron-Aware Data Selection for Annotation-Free LLM Self-Distillation
Gaze Heads: How VLMs Look at What They Describe
gender bias in VLMs
When Good Verifiers Go Bad: Self-Improving VLMs Can Regress on New Tasks
WB-ChartExtract
FORCE: Efficient VLA Reinforcement Fine-Tuning via Value-Calibrated Warm-up and Self-Distillation
SmolVLM
Where Do Models Find Happiness? Emotion Vectors in Open-Source LLMs
Test-Time Scaling for Small VLMs on Multilingual Visual MCQ
Forecasting With LLMs: Improved Generalization Through Feature Steering

Recent events (1)

5 · arXiv · cs.CL · May 27, 2026

Self-Ensembling Vision-Language Models for Chart Data Extraction

This paper proposes a self-ensembling method for chart-to-table extraction using vision-language models (VLMs), where multiple tabular outputs are sampled from the same VLM for a given chart image and aggregated via per-cell median over numerical values. The approach includes convergence detection and uncertainty estimation based on sample dispersion. The authors also introduce WB-ChartExtract, a new benchmark built from World Bank data featuring charts with ~7x more datapoints than ChartQA. The method achieves up to 23% relative improvement on WB-ChartExtract over single-pass VLM baselines.
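The aggregation step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not say how sampled tables are aligned, which dispersion measure is used, or how convergence is defined. The sketch assumes each sample is already parsed into a mapping from (row label, column label) to a number, uses median absolute deviation as the per-cell dispersion, and treats the ensemble as converged once the per-cell medians stop moving by more than a relative tolerance. The names `sample_fn`, `aggregate_samples`, `has_converged`, and `self_ensemble` are hypothetical.

```python
import statistics
from typing import Callable, Dict, List, Tuple

# Assumed representation: (row label, column label) -> numeric cell value.
Table = Dict[Tuple[str, str], float]


def aggregate_samples(samples: List[Table]) -> Tuple[Table, Table]:
    """Return per-cell medians and per-cell dispersion (median absolute deviation)."""
    cells: Dict[Tuple[str, str], List[float]] = {}
    for table in samples:
        for key, value in table.items():
            cells.setdefault(key, []).append(value)

    medians: Table = {}
    dispersion: Table = {}
    for key, values in cells.items():
        med = statistics.median(values)
        medians[key] = med
        # Assumption: MAD stands in for the paper's unspecified dispersion measure.
        dispersion[key] = statistics.median([abs(v - med) for v in values])
    return medians, dispersion


def has_converged(prev: Table, curr: Table, rel_tol: float) -> bool:
    """True if both tables have the same cells and every median moved by at most rel_tol."""
    if prev.keys() != curr.keys():
        return False
    return all(
        abs(curr[k] - prev[k]) <= rel_tol * max(abs(prev[k]), 1e-9) for k in curr
    )


def self_ensemble(
    sample_fn: Callable[[], Table],  # hypothetical: one VLM call, parsed into a Table
    max_samples: int = 16,
    min_samples: int = 3,
    rel_tol: float = 0.01,
) -> Tuple[Table, Table, int]:
    """Sample tables until the per-cell medians stabilize or the budget runs out."""
    if max_samples < 1:
        raise ValueError("max_samples must be at least 1")
    samples: List[Table] = []
    prev = None
    for _ in range(max_samples):
        samples.append(sample_fn())
        medians, dispersion = aggregate_samples(samples)
        if (
            len(samples) >= min_samples
            and prev is not None
            and has_converged(prev, medians, rel_tol)
        ):
            break
        prev = medians
    return medians, dispersion, len(samples)
```

Under these assumptions, a single outlying sample (for example, one misread cell) shifts the median far less than it would shift a mean, and cells with high dispersion can be flagged as low-confidence for downstream review.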

Evaluation and Benchmarking · Multimodal Progress · WB-ChartExtract · ChartQA · World Bank