product
DramaSR-LRM
product · active · provisional
dramasr-lrm-0bef7b9c · 1 event · first seen 16h ago
Aliases: DramaSR-LRM
Recent events (1)
DramaSR-LRM: Reasoning LLM with multimodal tool-use for speaker recognition in TV dramas
Researchers introduce DramaSR-532K, a large-scale benchmark of 532K annotated dialogue lines across 900+ characters from long-form TV dramas, targeting multimodal speaker recognition. They also propose DramaSR-LRM, a system built on a large reasoning model that uses multimodal tool-use to aggregate auditory, linguistic, and visual cues for speaker attribution. The approach significantly outperforms baselines, especially on short utterances where acoustic biometrics alone are unreliable. Data and code are to be publicly released.