Manga109

dataset · active · manga109-c69e3d18 · 1 event · first seen 29d ago

Aliases: Manga109

Recent events (1)

arXiv · cs.CL · 29d ago

Manga109-v2026: Revised Benchmark Dataset for Manga OCR and Multimodal Understanding

Researchers revisit the widely used Manga109 dataset and identify five categories of annotation issues, including transcription errors, missing text regions, and under-segmented speech balloons. They construct Manga109-v2026 by combining OCR-based issue detection with manual revision, correcting approximately 29,000 dialogue annotations. The updated dataset is intended to align better with modern OCR and multimodal manga understanding systems while preserving manga-specific expressive structures.