LoCoMo
locomo-0edfe856 · 3 events · first seen 20d ago
Aliases: LoCoMo
Recent events (3)
ConvMemory v2: Recall-preserving cross-encoder reranker for conversational memory retrieval
ConvMemory v2 is a fine-tuned cross-encoder reranker (22M parameters, based on ms-marco-MiniLM-L-6-v2) that reorders the top-10 candidates from the prior ConvMemory v1 system without changing which memories are retrieved, preserving Recall@10 by construction. On the LoCoMo conversational memory benchmark, v2 raises MRR from 0.5824 to 0.6560 and Hit@1 from 0.4440 to 0.5474, closing most of the gap to a much more expensive full-pool cross-encoder baseline. An ablation study confirms that candidate-specific memory text is the key mechanism driving the improvement.
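The "preserving Recall@10 by construction" property can be illustrated with a minimal sketch. The code below is not ConvMemory's implementation: `overlap_score` is a hypothetical stand-in for the cross-encoder, and the query, candidates, and relevance labels are made-up toy data. It only shows that reordering a fixed candidate list can change reciprocal rank and Hit@1 while leaving recall untouched, because no candidate is added or dropped.

```python
from typing import Callable, Sequence, Set


def rerank(query: str, candidates: Sequence[str],
           score: Callable[[str, str], float]) -> list:
    """Reorder a fixed candidate list by descending score.

    The output is a permutation of the input, so any set-based metric
    over the full list (such as Recall@k with k >= len(candidates))
    cannot change.
    """
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of relevant items that appear in the top k."""
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item; MRR is the mean of this over queries."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def hit_at_1(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1.0 if the top-ranked item is relevant, else 0.0."""
    return 1.0 if ranked and ranked[0] in relevant else 0.0


def overlap_score(query: str, candidate: str) -> float:
    # Hypothetical scorer (assumption): counts shared lowercase tokens.
    # A real system would call a trained cross-encoder here instead.
    return float(len(set(query.lower().split()) & set(candidate.lower().split())))


# Toy data (assumption), standing in for a first-stage top-k list.
query = "where did alice move"
first_stage = ["bob likes tea", "alice moved to paris", "the weather is nice"]
relevant = {"alice moved to paris"}

reranked = rerank(query, first_stage, overlap_score)

print("recall before/after:", recall_at_k(first_stage, relevant),
      recall_at_k(reranked, relevant))
print("RR before/after:", reciprocal_rank(first_stage, relevant),
      reciprocal_rank(reranked, relevant))
print("Hit@1 before/after:", hit_at_1(first_stage, relevant),
      hit_at_1(reranked, relevant))
```

In this toy case the relevant memory moves from rank 2 to rank 1, so reciprocal rank and Hit@1 improve while recall stays the same, which mirrors the kind of change the summary reports for MRR and Hit@1.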
FluxMem: Connectivity-Evolving Memory Framework for LLM Agents
FluxMem proposes a heterogeneous graph-based memory framework for LLM agents that continuously evolves its topology through three stages: initial connection formation, feedback-driven refinement, and long-term consolidation. Unlike static memory repositories, FluxMem repairs missing links, prunes interference, aligns abstraction granularity, and distills successful trajectories into reusable procedural circuits. The system is guided by a single metric for memory generalizability and evolutionary maturity, achieving state-of-the-art results on LoCoMo, Mind2Web, and GAIA benchmarks.
EvoArena benchmark and EvoMem memory paradigm for LLM agents in dynamic environments
Researchers introduce EvoArena, a benchmark suite that evaluates LLM agents in dynamic environments by modeling changes as progressive update sequences across terminal, software, and social domains. Alongside it, they propose EvoMem, a patch-based memory paradigm that records memory evolution as structured update histories to help agents reason about environmental change. Current agents score only 39.6% average accuracy on EvoArena, while EvoMem yields consistent gains on EvoArena and also improves performance on GAIA and LoCoMo benchmarks. The work highlights a significant gap between static-benchmark performance and real-world dynamic deployment requirements.
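One way to picture "memory evolution as structured update histories" is an append-only log of patches alongside the current state. The sketch below is an illustration under stated assumptions, not EvoMem's actual design: the `Patch` and `PatchMemory` names, their fields, and the example keys are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Patch:
    """One recorded change to a memory entry (hypothetical schema)."""
    step: int                 # position in the update sequence
    key: str                  # which fact changed
    old: Optional[str]        # value before the update (None if new)
    new: Optional[str]        # value after the update (None if deleted)


@dataclass
class PatchMemory:
    """Keeps the current state plus the full history of how it got there."""
    state: Dict[str, str] = field(default_factory=dict)
    history: List[Patch] = field(default_factory=list)

    def update(self, step: int, key: str, value: Optional[str]) -> None:
        # Record the change before applying it, so the old value is preserved.
        self.history.append(Patch(step, key, self.state.get(key), value))
        if value is None:
            self.state.pop(key, None)
        else:
            self.state[key] = value

    def history_of(self, key: str) -> List[Patch]:
        """All patches that touched one key, in the order they were applied."""
        return [p for p in self.history if p.key == key]

    def state_at(self, step: int) -> Dict[str, str]:
        """Replay patches up to and including `step` to rebuild a past state."""
        snapshot: Dict[str, str] = {}
        for p in self.history:
            if p.step > step:
                continue
            if p.new is None:
                snapshot.pop(p.key, None)
            else:
                snapshot[p.key] = p.new
        return snapshot


# Toy usage (assumption): a tool's flag changes between environment versions.
memory = PatchMemory()
memory.update(1, "cli.flag", "--verbose")
memory.update(2, "cli.flag", "--debug")

print(memory.state)                                  # current view
print(memory.state_at(1))                            # view before the change
print([p.new for p in memory.history_of("cli.flag")])
```

Keeping the old value in each patch lets an agent ask both "what is true now" and "what changed, and when", which an overwrite-only store cannot answer.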