Manga109-v2026

dataset · active · manga109-v2026-3ca5de5a · 1 event · first seen 29d ago

Aliases: Manga109-v2026

Recent events (1)

4 · arXiv · cs.CL · 29d ago

Manga109-v2026: Revised Benchmark Dataset for Manga OCR and Multimodal Understanding

Researchers revisit the widely used Manga109 dataset and identify five categories of annotation issues, including transcription errors, missing text regions, and under-segmented speech balloons. They construct Manga109-v2026 by combining OCR-based issue detection with manual revision, correcting approximately 29,000 dialogue annotations. The updated dataset is intended to better align with modern OCR and multimodal manga understanding systems while preserving manga-specific expressive structures.