Almanac
4 · GitHub Trending (AI/LLM filtered) · 13d ago

TurboVec: high-performance vector index built on TurboQuant with Rust/Python bindings

TurboVec is an open-source vector index library implemented in Rust with Python bindings, built on top of TurboQuant. The project has 7,019 GitHub stars, 1,533 of them added in a single day, which indicates significant community interest. It targets high-performance approximate nearest neighbor search, a core component of retrieval-augmented generation (RAG) and other embedding-based retrieval pipelines.

Related events (8)

5 · Hugging Face Blog · 1mo ago

Introducing Optimum: The Optimization Toolkit for Transformers at Scale

Hugging Face announced Optimum, an optimization toolkit designed to accelerate Transformers models on various hardware backends. The toolkit aims to bridge the gap between Transformers model development and hardware-specific optimizations from partners. It provides a unified interface for quantization, pruning, and hardware-accelerated inference across different accelerators.

4 · Hugging Face Blog · 1mo ago

CPU Optimized Embeddings with Optimum Intel and fastRAG

Hugging Face and Intel demonstrate CPU-optimized embedding inference using Optimum Intel and fastRAG, targeting RAG pipeline acceleration without GPU hardware. The post covers quantization and optimization techniques that improve embedding throughput on Intel CPUs. This is relevant to inference economics and enterprise deployment patterns where GPU availability is constrained.

6 · arXiv · cs.AI · 25d ago

Channel-wise Vector Quantization (CVQ): A New Image Tokenization Paradigm with Next-Channel Prediction

Researchers introduce Channel-wise Vector Quantization (CVQ), which replaces conventional patch-wise discrete tokens with channel-wise tokens that represent an image as discrete levels of visual detail. Built on CVQ, the Channel-wise Autoregressive (CAR) model uses a 'next-channel prediction' objective, generating images by progressively refining from global structure to fine-grained attributes. CVQ achieves 100% codebook utilization with a 16K+ codebook and the CAR model scores 86.7 on DPG and 0.79 on GenEval for text-to-image generation. The approach offers a structural alternative to raster-order patch-based autoregressive image generation.
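Codebook utilization, the metric quoted above, is the fraction of codebook entries that are selected at least once when a set of vectors is quantized. The sketch below shows that measurement for generic nearest-codeword vector quantization. It does not reproduce CVQ's channel-wise tokenization, and the function names `nearest_code` and `codebook_utilization` are hypothetical.

```python
def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_index, best_dist = 0, float("inf")
    for i, code in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(vec, code))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def codebook_utilization(vectors, codebook):
    """Fraction of codebook entries chosen by at least one input vector."""
    used = {nearest_code(v, codebook) for v in vectors}
    return len(used) / len(codebook)


if __name__ == "__main__":
    # Toy 2-D codebook with three entries (illustrative values only)
    codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    vectors = [[0.1, 0.0], [0.9, 1.2], [4.0, 6.0]]
    print(codebook_utilization(vectors, codebook))
```

A utilization of 1.0 means no entry is dead; values well below 1.0 indicate that part of the codebook is never selected.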

5 · Hugging Face Blog · 1mo ago

Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval

This Hugging Face blog post covers techniques for quantizing text embeddings to binary and scalar (int8) representations, enabling faster similarity search and a smaller memory footprint. Binary quantization stores 1 bit per dimension instead of a 32-bit float, a 32x memory reduction, and is searched with Hamming distance; scalar quantization stores 1 byte per dimension, a 4x reduction, and offers a middle ground between speed and accuracy. Practical implementation guidance is provided using Sentence Transformers with the FAISS and USearch libraries, and benchmark results show the tradeoffs between retrieval speed and accuracy.
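The two schemes can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the Sentence Transformers, FAISS, or USearch implementation: the helper names are hypothetical, binarization thresholds at zero, and the int8 range `[lo, hi]` is assumed to be supplied by the caller (in practice it would be estimated from a calibration set). Storing 1 bit instead of a 32-bit float per dimension is a 32x reduction by construction.

```python
import random


def binarize(vec):
    """Pack the sign of each dimension into one integer, 1 bit per dimension."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0 else 0)
    return code


def hamming(a, b):
    """Count the bit positions where two packed codes differ."""
    return bin(a ^ b).count("1")


def scalar_quantize(vec, lo, hi):
    """Linearly map values in [lo, hi] to int8 codes in [-128, 127], clipping outliers."""
    scale = 255.0 / (hi - lo)
    return [int(round((min(max(x, lo), hi) - lo) * scale)) - 128 for x in vec]


def bytes_per_vector(dim):
    """Storage per embedding for each representation."""
    return {"float32": 4 * dim, "int8": dim, "binary": (dim + 7) // 8}


def search(query, corpus_codes, k):
    """Return indices of the k binary codes closest to the query in Hamming distance."""
    q = binarize(query)
    order = sorted(range(len(corpus_codes)), key=lambda i: hamming(q, corpus_codes[i]))
    return order[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    dim = 1024
    # Synthetic stand-in for real embeddings
    corpus = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(200)]
    codes = [binarize(v) for v in corpus]
    # Query with a lightly perturbed copy of document 17
    query = [x + rng.gauss(0, 0.1) for x in corpus[17]]
    print(search(query, codes, k=3))
    sizes = bytes_per_vector(dim)
    print(sizes["float32"] / sizes["binary"])  # 32.0
```

A common refinement is to retrieve a larger candidate set with the binary codes and then rescore those candidates with higher-precision vectors, trading a little latency for accuracy.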

4 · GitHub Trending · 7d ago

Vercel AI SDK: open-source TypeScript toolkit for AI-powered applications and agents

Vercel's AI SDK is an open-source TypeScript library for building AI-powered applications and agents, created by the team behind Next.js. The repository has accumulated 24,842 GitHub stars with modest daily growth (+11 today). It is a widely adopted tooling layer for integrating LLMs into TypeScript and JavaScript applications.

7 · arXiv · cs.AI · 29d ago

Vector Policy Optimization: Training for Diversity Improves Test-Time Search

Vector Policy Optimization (VPO) is a new RL post-training algorithm for LLMs that replaces the scalar reward paradigm with vector-valued rewards, explicitly training models to produce diverse solution sets that specialize across different reward trade-offs. VPO is designed as a near-drop-in replacement for the GRPO advantage estimator and targets inference-scaling search procedures like AlphaEvolve. Across four tasks, VPO matches or outperforms scalar RL baselines on pass@k and best@k metrics, with advantages growing as search budget increases, and unlocks evolutionary search problems that GRPO-trained models cannot solve. The paper argues that diversity-optimized post-training may need to become the default as inference-time search becomes standard.
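For context, the scalar baseline that VPO is described as replacing can be sketched as follows. This is the commonly described GRPO group-relative advantage: each sampled response's reward is normalized by the mean and standard deviation of its group. The summary above does not specify VPO's vector-valued estimator, so it is not reproduced here. The function name is hypothetical, and the use of population standard deviation and of `eps` is an assumption, since implementations differ on both.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each scalar reward against its own group of sampled responses.

    rewards: one scalar reward per response sampled for the same prompt.
    eps: small constant (assumed) that avoids division by zero when all rewards tie.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four responses to one prompt, two of which were rewarded
    print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
```

Because every response is scored against a single scalar, responses that differ only in how they trade off multiple objectives receive the same signal, which is the limitation the vector-valued formulation is meant to address.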

5 · Hugging Face Blog · 1mo ago

Quanto: a PyTorch quantization backend for Optimum

Hugging Face introduced Quanto, a new PyTorch-based quantization backend integrated into the Optimum library. Quanto supports multiple quantization schemes and data types, targeting efficient inference for large language models and other neural networks. The tool is designed to work across hardware backends and integrates with the Hugging Face ecosystem.

4 · GitHub Trending · 17d ago

Vibe-Trading: open-source personal trading agent framework gains traction on GitHub

Vibe-Trading is a Python-based open-source trading agent project from HKUDS (University of Hong Kong) that has accumulated 9,642 GitHub stars, 221 of them added in a single day. The project positions itself as a personal AI trading agent. The rapid star growth signals community interest in AI-driven autonomous trading systems.