Almanac
organization

OpenDataLab

organization · active · provisional · opendatalab-806eb030 · 1 event · first seen 18d ago

Aliases: OpenDataLab

Recent events (1)

4 · GitHub Trending · 18d ago

MinerU: Document-to-LLM-Ready Markdown/JSON Conversion Tool

MinerU is an open-source Python tool from OpenDataLab that converts complex documents (PDFs, Office files) into structured Markdown or JSON suited to LLM and agentic workflows. When the event was recorded, the repository had 65,610 GitHub stars, 180 of them added that day, which points to continued community interest. It targets a common preprocessing bottleneck in RAG and agent pipelines.