3 · GitHub Trending (AI/LLM filtered) · 1mo ago

vLLM: High-Throughput LLM Inference and Serving Engine Trending on GitHub

vLLM is an open-source Python library for high-throughput, memory-efficient inference and serving of large language models. The project has accumulated over 80,500 GitHub stars, with 98 new stars today, indicating continued strong community interest. It is a widely adopted inference backend in the AI/ML ecosystem, implementing PagedAttention along with a range of other optimization techniques for LLM deployment.
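
For readers unfamiliar with PagedAttention, the core idea is that the KV cache is stored in fixed-size blocks that are allocated on demand, with a per-sequence block table mapping logical token positions to physical blocks. The sketch below is a toy illustration of that bookkeeping only. The class name `PagedKVAllocator`, its methods, and the block size are hypothetical and are not vLLM's API or implementation.

```python
class PagedKVAllocator:
    """Toy block allocator (hypothetical, not vLLM code)."""

    def __init__(self, num_blocks, block_size=4):
        self.block_size = block_size                # tokens per block (assumed value)
        self.free_blocks = list(range(num_blocks))  # pool of physical block ids
        self.block_tables = {}                      # seq_id -> list of physical block ids
        self.lengths = {}                           # seq_id -> number of tokens stored

    def append_token(self, seq_id):
        """Reserve a slot for one more token; allocate a block only when needed."""
        n = self.lengths.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if n % self.block_size == 0:                # no block yet, or the last one is full
            if not self.free_blocks:
                raise MemoryError("no free KV blocks")
            table.append(self.free_blocks.pop())
        self.lengths[seq_id] = n + 1
        # Physical location of the new token: (block id, offset inside the block)
        return table[n // self.block_size], n % self.block_size

    def free(self, seq_id):
        """Return all blocks of a finished sequence to the pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)


alloc = PagedKVAllocator(num_blocks=3, block_size=4)
slots = [alloc.append_token("a") for _ in range(5)]  # the fifth token needs a second block
```

Because memory is claimed one block at a time, a sequence never reserves more than one partially filled block, which is the memory-efficiency argument usually made for this design.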

Related events (8)

4 · GitHub Trending · 10d ago

LiteLLM AI gateway trending: 50K stars, unified interface for 100+ LLM APIs

LiteLLM is a Python SDK and proxy server providing a unified OpenAI-compatible interface to 100+ LLM APIs including Bedrock, Azure, OpenAI, VertexAI, Anthropic, and others. It includes cost tracking, guardrails, load balancing, and logging. The project is trending on GitHub with ~50K total stars and 141 new stars today, signaling continued strong adoption as an AI gateway layer.

4 · GitHub Trending · 27d ago

free-llm-api-resources: Curated List of Free LLM API Inference Endpoints

A GitHub repository maintained by cheahjs catalogues free LLM inference resources accessible via API, accumulating over 22,000 stars with 89 added today. The project serves as a community reference for developers seeking zero-cost access to hosted language model endpoints. The high star count signals broad practitioner interest in inference cost reduction and accessible model APIs.

3 · GitHub Trending · 8d ago

mlx-lm: LLM inference library for Apple MLX framework trending on GitHub

mlx-lm is an open-source Python library for running LLMs using Apple's MLX framework, designed for Apple Silicon hardware. The repository has accumulated 5,817 stars with 43 new stars today, indicating steady community interest. It represents a key piece of the Apple-native ML inference ecosystem.

4 · Hugging Face Blog · 1mo ago

Optimizing your LLM in production

A Hugging Face blog post covering practical techniques for optimizing large language models in production environments. The post likely addresses inference efficiency methods such as quantization, batching, caching, and hardware utilization strategies. It serves as a practitioner-oriented guide for deploying LLMs at scale.

4 · GitHub Trending · 8d ago

LMCache: KV cache layer for LLM inference acceleration

LMCache is an open-source Python library providing a KV cache layer designed to accelerate LLM inference. The project has accumulated 8,613 GitHub stars with modest daily growth (+17). It targets inference efficiency by offloading or sharing KV cache state across requests.
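
As a rough illustration of what sharing KV cache state across requests can mean, the toy sketch below stores a stand-in for KV state under a hash of each chunk-aligned token prefix, so a later request with the same prefix can skip recomputing it. Everything here (`PrefixKVCache`, `put`, `longest_hit`, the chunk size) is a hypothetical name chosen for illustration and does not reflect LMCache's actual API or storage design.

```python
import hashlib


class PrefixKVCache:
    """Toy prefix-keyed store (hypothetical, not LMCache code)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # prefixes are cached at multiples of this length
        self.store = {}               # prefix hash -> stand-in for KV tensors

    def _key(self, tokens):
        return hashlib.sha256(repr(tuple(tokens)).encode()).hexdigest()

    def put(self, tokens, compute_kv):
        """Cache KV state for every full chunk-aligned prefix of `tokens`."""
        for end in range(self.chunk_size, len(tokens) + 1, self.chunk_size):
            key = self._key(tokens[:end])
            if key not in self.store:
                self.store[key] = compute_kv(tokens[:end])

    def longest_hit(self, tokens):
        """Return (cached_token_count, kv) for the longest cached prefix."""
        best = (0, None)
        for end in range(self.chunk_size, len(tokens) + 1, self.chunk_size):
            kv = self.store.get(self._key(tokens[:end]))
            if kv is None:
                break                 # longer prefixes cannot hit if this one missed
            best = (end, kv)
        return best


cache = PrefixKVCache(chunk_size=2)
cache.put([1, 2, 3, 4, 5], compute_kv=len)   # `len` stands in for a real KV computation
hit = cache.longest_hit([1, 2, 3, 4, 9, 9])  # shares the first four tokens
miss = cache.longest_hit([7, 8])
```

A second request that shares a long system prompt with the first would only need to compute attention state for the tokens after the cached prefix.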

4 · GitHub Trending · 24d ago

Langfuse: Open Source LLM Engineering Platform Trending on GitHub

Langfuse is an open-source LLM engineering platform providing observability, metrics, evaluations, prompt management, and dataset tooling. It integrates with OpenTelemetry, LangChain, OpenAI SDK, and LiteLLM. The project has accumulated 28,075 GitHub stars with 89 new stars today, indicating sustained community traction. Backed by Y Combinator (W23), it represents a notable entry in the LLM ops/tooling ecosystem.

4 · GitHub Trending · 15d ago

vllm-omni: framework for efficient inference with omni-modality models

The vllm-project organization has published vllm-omni, a Python framework extending vLLM's inference capabilities to omni-modality models. The repository has accumulated about 4,956 GitHub stars. It represents an expansion of the vLLM ecosystem into multimodal inference serving.

4 · Hugging Face Blog · 1mo ago

Open-Source Text Generation & LLM Ecosystem at Hugging Face

Hugging Face published a blog post surveying the open-source LLM ecosystem as of mid-2023, covering text generation models, tooling, and deployment patterns available on the platform. The post highlights the breadth of open-weight models and associated infrastructure for inference and fine-tuning. It serves as a reference overview of the state of open-source LLMs at that point in time.