FAISS
faiss-b9867cc2 · 2 events · first seen 28d ago
Aliases: FAISS
Recent events (2)
Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval
This Hugging Face blog post covers techniques for quantizing text embeddings to binary (1-bit) and scalar (int8) representations, which speeds up similarity search and shrinks the memory footprint. Binary quantization stores one bit per dimension instead of a 32-bit float, a 32x memory reduction relative to float32 embeddings, and is searched with Hamming distance; scalar quantization offers a middle ground between speed and accuracy. The post gives practical implementation guidance using Sentence Transformers with the FAISS and USearch libraries, along with benchmark results showing the tradeoffs between retrieval speed and accuracy.
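As a rough illustration of the binary quantization idea summarized above, the following is a minimal NumPy-only sketch, not the blog post's own code. It assumes float32 input embeddings and a simple threshold at zero; the helper names `binary_quantize` and `hamming_search` and the random test data are hypothetical, introduced only for this example.

```python
import numpy as np


def binary_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Map each dimension to 1 if positive, else 0, and pack 8 bits per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def hamming_search(query_code: np.ndarray, corpus_codes: np.ndarray, k: int):
    """Brute-force top-k search by Hamming distance over packed binary codes."""
    # XOR leaves a 1 bit wherever the two codes disagree.
    xor = np.bitwise_xor(corpus_codes, query_code)
    # Count the differing bits per corpus row.
    distances = np.unpackbits(xor, axis=-1).sum(axis=-1)
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]


# Synthetic data standing in for real text embeddings (an assumption).
rng = np.random.default_rng(0)
corpus = rng.standard_normal((1000, 256)).astype(np.float32)
# A slightly perturbed copy of row 42 serves as the query.
query = corpus[42] + 0.1 * rng.standard_normal(256).astype(np.float32)

corpus_codes = binary_quantize(corpus)           # shape (1000, 32), dtype uint8
query_code = binary_quantize(query[None, :])[0]  # shape (32,)

ids, dists = hamming_search(query_code, corpus_codes, k=5)

# 256 float32 values take 1024 bytes; 256 bits take 32 bytes.
memory_ratio = corpus.nbytes / corpus_codes.nbytes
print(ids, dists, memory_ratio)
```

In practice the brute-force loop would typically be replaced by a dedicated binary index from a library such as FAISS or USearch; the sketch only shows why packed codes are smaller and how Hamming distance ranks candidates.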
SkillWeaver: Compositional Skill Routing for LLM Agents via Decompose-Retrieve-Compose
Researchers introduce SkillWeaver, a framework for compositional skill routing in LLM agents that decomposes complex queries into atomic sub-tasks, retrieves matching skills from a large library, and composes an executable DAG plan. The paper formalizes the Compositional Skill Routing problem and introduces CompSkillBench, a benchmark of 300 compositional queries over 2,209 real MCP server skills across 24 categories. A key finding is that task decomposition quality is the primary bottleneck, with standard LLM decomposition reaching only 34.2% category recall; the proposed Iterative Skill-Aware Decomposition (SAD) method improves decomposition accuracy from 51.0% to 67.7% in a single iteration. The framework also reduces context window consumption by over 99% compared to naive skill-stuffing approaches.