Almanac
technique

M³Proctor

techniqueactiveprovisionalm-proctor-8a64fe1d·1 events·first seen 9d ago

Aliases: M³Proctor

Co-occurring entities

More like this (12)

Recent events (1)

5arXiv · cs.CL·9d ago·source ↗

M³Exam: Benchmark for Multimodal Memory in Realistic User-Agent Interactions

Researchers introduce M³Exam, a query-centric multimodal conversational memory benchmark designed to evaluate language agents on realistic user-agent interactions, including cross-modal grounding and implicit information inference. Existing benchmarks are critiqued for assuming sparse visuals and human-human interaction formats. The paper also proposes M³Proctor, a companion memory method that detects query modality bias and retrieves raw visual sources on demand, achieving 13% accuracy improvement while reducing index-construction time and retrieved tokens by over 70%.