Almanac
4 · arXiv cs.CL (Computation and Language) · 24h ago

Multi-agent semantic rewriting framework for privacy-preserving RAG

A new arXiv preprint proposes a three-agent framework for sanitizing retrieved content in RAG pipelines by performing privacy extraction, semantic analysis, and reconstruction as an offline preprocessing step. Evaluated on ChatDoctor and Wiki-PII datasets across six LLMs, the approach reduces targeted information exposure in LLaMA-3-8B from 144 baseline instances to 1, while maintaining contextual fidelity (BLEU-1 of 0.122 vs. SAGE's 0.117). The framework introduces no additional online inference latency since rewriting is done offline. Source code is publicly released.

Related events (8)

5 · arXiv · cs.LG · 7d ago

ReproRepo: Scalable LLM agent framework for reproducibility auditing using GitHub issues

ReproRepo is a new framework for evaluating LLM agents on reproducibility auditing of ML research, using naturally occurring GitHub issues as supervision signals rather than costly manual curation. The framework is instantiated on 1,149 recent ML papers from major conferences and benchmarks four frontier model-agent configurations. The best-performing agent (Codex with GPT-5.5) surfaces at least one semantically related human-reported reproduction blocker for ~90% of papers, though exact localization of issues remains a weakness. The work provides a reusable, scalable evaluation harness for this underexplored agentic task.

6 · Anthropic News · 21d ago

Anthropic introduces Contextual Retrieval to reduce RAG retrieval failures by up to 67%

Anthropic published a technical method called Contextual Retrieval that combines Contextual Embeddings and Contextual BM25 to address the context-loss problem in traditional RAG pipelines. The approach prepends chunk-level context before encoding, reducing failed retrievals by 49% standalone and 67% when combined with reranking. The post also highlights prompt caching as a simpler alternative for knowledge bases under 200K tokens, and provides a cookbook for deployment with Claude.
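
The "prepend chunk-level context before encoding" step can be illustrated with a minimal sketch. The function names `contextualize_chunks` and `title_only_context` are hypothetical and are not taken from Anthropic's cookbook; the `describe` callable stands in for whatever component produces the context text, and here it is a trivial stub rather than a model call.

```python
from typing import Callable, List


def contextualize_chunks(
    doc_title: str,
    chunks: List[str],
    describe: Callable[[str, str], str],
) -> List[str]:
    """Return chunks with a short context string prepended to each one.

    The returned strings are what would be passed to the embedding model
    and to the BM25 index, in place of the raw chunks.
    """
    contextualized = []
    for chunk in chunks:
        context = describe(doc_title, chunk).strip()
        # Fall back to the raw chunk if no context could be produced.
        contextualized.append(f"{context}\n\n{chunk}" if context else chunk)
    return contextualized


def title_only_context(doc_title: str, chunk: str) -> str:
    """Hypothetical stand-in for a context generator: it only names the document."""
    return f"This excerpt is from the document '{doc_title}'."


if __name__ == "__main__":
    for text in contextualize_chunks(
        "ACME Q2 report", ["Revenue grew 3%."], title_only_context
    ):
        print(text)
```

The point of the design is that a chunk such as "Revenue grew 3%." carries no hint of which company or period it refers to, so the prepended context gives both the dense and the sparse index something to match on.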

5 · GitHub Trending · 1mo ago

LEANN: RAG System with 97% Storage Savings for On-Device Private Retrieval

LEANN is an open-source retrieval-augmented generation (RAG) system targeting personal device deployment, with a claimed 97% storage reduction compared to conventional vector index approaches. The project is associated with MLSys 2026, suggesting an upcoming systems research paper. It emphasizes privacy through fully local execution and aims to maintain retrieval accuracy despite aggressive compression. The repository has accumulated over 11,000 stars with strong recent momentum.

4 · GitHub Trending · 5d ago

HippoRAG: RAG framework combining knowledge graphs and Personalized PageRank for continuous knowledge integration

HippoRAG is an open-source RAG framework published at NeurIPS 2024 by the OSU NLP Group that draws on models of human long-term memory to enable LLMs to continuously integrate knowledge across external documents. It combines retrieval-augmented generation with knowledge graphs and Personalized PageRank to improve multi-hop and associative retrieval. The repository has accumulated 3,742 GitHub stars with ongoing community traction.
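
For readers unfamiliar with Personalized PageRank, the following is a generic power-iteration sketch of the algorithm, not HippoRAG's implementation. The toy graph, the function name `personalized_pagerank`, and the choice to return the mass of dangling nodes to the seeds are assumptions made for illustration. Every node, including link targets, must appear as a key in `graph`.

```python
from typing import Dict, List, Set


def personalized_pagerank(
    graph: Dict[str, List[str]],
    seeds: Set[str],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power iteration where the random walk restarts only at the seed nodes."""
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # Restart mass: with probability (1 - damping) jump back to a seed.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            targets = graph[n]
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: hand its mass back to the seeds (an assumption).
                for s in nodes:
                    nxt[s] += damping * rank[n] * restart[s]
        rank = nxt
    return rank


if __name__ == "__main__":
    toy = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
    for node, score in sorted(personalized_pagerank(toy, {"a"}).items()):
        print(node, round(score, 4))
```

In a retrieval setting the seeds would be the graph nodes matched by the query, so scores concentrate on nodes reachable from the query's entities, which is what makes the method suited to multi-hop lookups.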

4 · arXiv · cs.CL · 7d ago

HistoRAG: A RAG framework embedding historiographical methodology for historical research

Researchers introduce HistoRAG, a Retrieval-Augmented Generation framework that adapts RAG architecture to the epistemological requirements of historical scholarship. Key interventions include separated retrieval and generation, temporal windowing to ensure balanced source representation across time periods, and LLM-as-judge evaluation for transparent relevance judgments. The framework is evaluated on SPIEGELragged, a corpus of 102,189 Der Spiegel articles from 1950–1979, revealing concrete deficiencies in standard RAG for historical work (e.g., era-specific vocabulary failures, weak correlation between vector similarity and LLM-assessed relevance). The paper also introduces the concept of 'Zwischentexte' as a framework for responsible integration of LLM-generated text into scholarly practice.
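
One way to read "temporal windowing" is as a per-window cap on retrieved documents, so that no single period crowds out the others. The sketch below follows that reading; it is an assumption about the general idea, not the paper's procedure, and the function name `windowed_top_k`, the tuple layout, and the sample hits are all hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

Hit = Tuple[str, int, float]  # (doc_id, publication_year, similarity_score)


def windowed_top_k(
    hits: Iterable[Hit], start_year: int, window_years: int, per_window: int
) -> List[Hit]:
    """Keep the best `per_window` hits from each fixed-width time window."""
    buckets = defaultdict(list)
    for doc_id, year, score in hits:
        buckets[(year - start_year) // window_years].append((doc_id, year, score))
    selected: List[Hit] = []
    for index in sorted(buckets):  # chronological order of windows
        best = sorted(buckets[index], key=lambda h: h[2], reverse=True)
        selected.extend(best[:per_window])
    return selected


if __name__ == "__main__":
    sample = [
        ("a", 1951, 0.9), ("b", 1952, 0.8), ("c", 1953, 0.7),
        ("d", 1965, 0.4), ("e", 1978, 0.3),
    ]
    print(windowed_top_k(sample, start_year=1950, window_years=10, per_window=1))
```

A plain global top-3 over the sample hits would return only documents from the 1950s; the windowed version trades some raw similarity for coverage of each decade.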

6 · arXiv · cs.AI · 1mo ago

LCGuard: Adversarial Training Framework for Safe KV Cache Sharing in Multi-Agent LLM Systems

LCGuard introduces a framework for preventing sensitive information leakage when multi-agent LLM systems share KV caches as a latent communication channel. The approach formalizes leakage operationally via reconstruction: a shared cache artifact is deemed unsafe if an adversarial decoder can recover sensitive inputs from it. An adversarial training loop pits a reconstructor against LCGuard's representation-level transformations, which aim to preserve task-relevant semantics while suppressing recoverable sensitive content. Empirical results across multiple model families and multi-agent benchmarks show reduced reconstruction-based leakage and attack success rates with competitive task performance.

3 · GitHub Trending · 7d ago

RAGFlow open-source RAG engine with agent capabilities trending on GitHub

RAGFlow is an open-source Retrieval-Augmented Generation engine that combines RAG with agent capabilities, positioned as a context layer for LLMs. The project has accumulated over 83,000 GitHub stars with 111 new stars today, indicating sustained community interest. It is maintained by InfiniFlow and represents a notable open-source tooling option in the RAG/agent ecosystem.

4 · arXiv · cs.CL · 12d ago

UMG-RAG: Training-free hybrid retrieval with uncertainty-aware granularity fusion for long-document RAG

Researchers propose Uncertainty-aware Multi-Granularity RAG (UMG-RAG), a training-free hybrid retrieval framework that addresses the tension between large and fine-grained retrieval chunks in RAG pipelines. The system converts dense and sparse retriever scores across multiple chunk granularities into evidence distributions, estimates reliability via entropy, and fuses candidates using query-specific confidence signals. A variant called UMGP-RAG uses fine-grained hits to locate evidence while returning broader parent chunks for coherence. Experiments on QA benchmarks show improved generation quality with no changes to the underlying retriever or generator.
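
The score-to-distribution, entropy, and fusion steps described above can be sketched as follows. This is an illustrative reading, not the paper's formulation: the use of softmax, the reliability weight defined as one minus normalized entropy, and the weighted-sum fusion rule are all assumptions, as are the function names.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn raw retriever scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reliability(probs: List[float]) -> float:
    """One minus normalized entropy: near 1 for a peaked list, near 0 for a flat one."""
    if len(probs) < 2:
        return 1.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


def fuse(result_lists: List[Dict[str, float]]) -> List[Tuple[str, float]]:
    """Fuse one {candidate_id: raw_score} dict per retriever and granularity."""
    fused = defaultdict(float)
    for results in result_lists:
        ids = list(results)
        probs = softmax([results[i] for i in ids])
        weight = reliability(probs)  # query-specific confidence in this list
        for cid, p in zip(ids, probs):
            fused[cid] += weight * p
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    dense = {"a": 5.0, "b": 0.0, "c": 0.0}   # confident: one clear winner
    sparse = {"a": 1.0, "b": 1.0, "c": 1.0}  # uninformative: all scores tied
    print(fuse([dense, sparse]))
```

Because the weights are computed from the score lists of each query, a retriever that cannot discriminate between candidates for that query contributes almost nothing, which requires no training and no change to the retrievers themselves.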