benchmark
IWSLT 2026 Cross-Lingual Voice Cloning
benchmark · active · provisional
iwslt-2026-cross-lingual-voice-cloning-b9ac78bd · 1 event · first seen 9d ago
Aliases: IWSLT 2026 Cross-Lingual Voice Cloning
Co-occurring entities
More like this (12)
IWSLT 2026
voice cloning
simultaneous speech-to-text translation
Leveraging Audio-LLMs to Filter Speech-to-Speech Training Data
Speech-to-Speech
GLM-4-Voice
The Shibboleth Effect: Auditing the Cross-Lingual Distributional Skew of Large Language Models
Voxtral TTS
Adaptive Turn-Taking for Real-time Multi-Party Voice Agents
Cross-Modal Masking for Robust Silent Speech Synthesis Using sEMG and Lipreading
foreground-background dual-agent voice architecture
Multilingual Coreference Resolution Shared Task
Recent events (1)
KIT submission to IWSLT 2026 cross-lingual voice cloning track with language tag prompting and RL fine-tuning
Researchers from KIT describe their system for the IWSLT 2026 Cross-Lingual Voice Cloning shared task, which aims to synthesize speech in a target language while preserving the source speaker's identity. The system builds on FishAudio-S2-Pro, a multilingual TTS model, and adds three components: language tag prompting to reduce accent leakage, RL fine-tuning to improve intelligibility, and a reference-conditioned lexical matching method for domain-specific pronunciation. Language tag prompting yields the largest gains; lexical matching provides consistent improvements on matched subsets.