4 · GitHub Trending (AI/LLM filtered) · 22d ago

PaddleOCR: OCR Toolkit Bridging Documents and LLMs

PaddleOCR is an open-source OCR toolkit built on PaddlePaddle that converts PDFs and images into structured data suitable for LLM pipelines. It supports more than 100 languages, and its maintainers position it as a bridge between documents and AI applications. The repository has nearly 79,000 GitHub stars and gained 148 today, which suggests continued community interest.

Enterprise Deployment Patterns Agent and Tool Ecosystem PaddlePaddle Python PaddleOCR

Related guides (2)

Enterprise Deployment Patterns · Topic guide

Enterprise Deployment Patterns: From AI Demo to Production Reality

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Related events (8)

4 · Hugging Face Blog · 1mo ago

PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend

PaddleOCR 3.5 introduces support for running OCR and document parsing pipelines using a Hugging Face Transformers backend, enabling integration with the broader Transformers ecosystem. The update allows users to leverage transformer-based models for optical character recognition and structured document understanding tasks. This represents a convergence between the PaddlePaddle framework and the Transformers library for document AI workloads.

Enterprise Deployment Patterns Agent and Tool Ecosystem PaddlePaddle PaddleOCR Hugging Face Transformers +1 more

4 · GitHub Trending · 24d ago

GLM-OCR: Fast and Accurate OCR System from zai-org

GLM-OCR is an open-source OCR project from zai-org built on the GLM model family, positioning itself as accurate, fast, and comprehensive. The repository has accumulated 6,787 GitHub stars with 82 added today, indicating notable community traction. It represents an application of large language/vision models to document understanding and text recognition tasks.

Open Weights Progress Multimodal Progress zai-org GLM-OCR GLM

7 · Mistral AI News · 19d ago

Mistral OCR: New Document Understanding API with State-of-the-Art Benchmark Performance

Mistral AI has released Mistral OCR, an Optical Character Recognition API designed for deep document understanding that handles text, tables, equations, images, and complex layouts in PDFs and images. Mistral claims top benchmark scores across math, multilingual, scanned, and table categories, outperforming Google Document AI, Azure OCR, Gemini 1.5/2.0, and GPT-4o on an internal test set. The API is priced at 1,000 pages per dollar (batch inference doubles that), is available via la Plateforme, and is already the default document understanding model in Le Chat. A selective self-hosting option is offered for organizations with sensitive data requirements.

Inference Economics Enterprise Deployment Patterns Mistral AI Azure OCR Gemini 1.5 Pro +8 more

6 · Mistral AI News · 1mo ago

Mistral OCR 3: New Frontier in Document Processing Accuracy and Efficiency

Mistral AI has released Mistral OCR 3 (model ID: mistral-ocr-2512), claiming a 74% overall win rate over its predecessor Mistral OCR 2 across forms, scanned documents, complex tables, and handwriting. The model supports markdown output with HTML-based table reconstruction and is priced at $2 per 1,000 pages ($1 with Batch API). It now powers the Document AI Playground in Mistral AI Studio, offering a drag-and-drop interface for parsing PDFs and images into text or structured JSON.
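The pricing quoted above can be turned into a rough budget estimate. The sketch below is illustrative only: it assumes cost scales linearly with page count at the two quoted rates, and the function name `ocr_cost` is a hypothetical helper, not part of any Mistral SDK. Actual billing may round or tier differently.

```python
from decimal import Decimal

# Rates quoted in the summary above, in USD per 1,000 pages.
STANDARD_PER_1K = Decimal("2")
BATCH_PER_1K = Decimal("1")


def ocr_cost(pages: int, batch: bool = False) -> Decimal:
    """Estimate cost in USD, assuming simple linear per-page pricing."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate = BATCH_PER_1K if batch else STANDARD_PER_1K
    return rate * pages / 1000


if __name__ == "__main__":
    # Compare the standard and Batch API estimates for a 250,000-page archive.
    print("standard:", ocr_cost(250_000))
    print("batch:   ", ocr_cost(250_000, batch=True))
```

Using `Decimal` rather than floats keeps the currency arithmetic exact for small page counts.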

Inference Economics Enterprise Deployment Patterns Mistral AI Document AI Playground Mistral Studio +2 more

4 · Hugging Face Blog · 1mo ago

Welcome PaddlePaddle to the Hugging Face Hub

Hugging Face announced the integration of PaddlePaddle, Baidu's open-source deep learning framework, into the Hugging Face Hub. This expands the Hub's ecosystem to support PaddlePaddle models alongside existing frameworks like PyTorch and TensorFlow. The move broadens access to Chinese-developed AI models and tooling within the broader ML community.

Open Weights Progress Agent and Tool Ecosystem PaddlePaddle Baidu Hugging Face

4 · GitHub Trending · 22d ago

MinerU: Document-to-LLM-Ready Markdown/JSON Conversion Tool

MinerU is an open-source Python tool by OpenDataLab that converts complex documents (PDFs, Office files) into structured markdown or JSON formats optimized for LLM and agentic workflows. The repository has accumulated 65,610 GitHub stars with 180 new stars today, indicating sustained community traction. It targets a common preprocessing bottleneck in RAG and agent pipelines.

Agent and Tool Ecosystem MinerU OpenDataLab

4 · Hugging Face Blog · 1mo ago

Open-Source Text Generation & LLM Ecosystem at Hugging Face

Hugging Face published a blog post surveying the open-source LLM ecosystem as of mid-2023, covering text generation models, tooling, and deployment patterns available on the platform. The post highlights the breadth of open-weight models and associated infrastructure for inference and fine-tuning. It serves as a reference overview of the state of open-source LLMs at that point in time.

Open Weights Progress Inference Economics Hugging Face +1 more

3 · GitHub Trending · 1mo ago

vLLM: High-Throughput LLM Inference and Serving Engine Trending on GitHub

vLLM is an open-source Python library providing high-throughput and memory-efficient inference and serving for large language models. The project has accumulated over 80,500 GitHub stars with 98 new stars today, indicating continued strong community interest. It is a widely adopted inference backend in the AI/ML ecosystem, supporting PagedAttention and various optimization techniques for LLM deployment.

Inference Economics Agent and Tool Ecosystem vllm-project vLLM