Entity · benchmark

IWSLT 2026 Cross-Lingual Voice Cloning

benchmark · active · iwslt-2026-cross-lingual-voice-cloning-b9ac78bd · 1 event · first seen Jun 8, 2026

Aliases: IWSLT 2026 Cross-Lingual Voice Cloning

Co-occurring entities

FishAudio-S2-Pro · Karlsruhe Institute of Technology

More like this (12)

IWSLT 2026
voice cloning
WordVoice
simultaneous speech-to-text translation
Interleaved Speech Language Models Latently Work In Text
SpeechLLM Meets Federated Learning for End-to-End ASR: English and Italian Case Studies
Same Lesson, Different Story: Cross-Lingual Reconstruction of Cultural Narratives in Large Language Models
From Sinhala to Dhivehi: Cross-Lingual Transfer Learning for Low-Resource Speech Recognition
CapSpeech-TTS
Leveraging Audio-LLMs to Filter Speech-to-Speech Training Data
Biomedical Machine Translation for Low-Resource Arabic-Script Languages via Cross-Lingual Transfer and LoRA Adapter Merging
Speech-to-Speech

Recent events (1)

arXiv · cs.CL · Jun 8, 2026

KIT submission to IWSLT 2026 cross-lingual voice cloning track with language tag prompting and RL fine-tuning

Researchers from the Karlsruhe Institute of Technology (KIT) describe their system for the IWSLT 2026 Cross-Lingual Voice Cloning shared task, which aims to synthesize speech in a target language while preserving the source speaker's identity. The system builds on FishAudio-S2-Pro, a multilingual TTS model, and adds three components: language tag prompting to reduce accent leakage, RL fine-tuning to improve intelligibility, and a reference-conditioned lexical matching method for domain-specific pronunciation. Language tag prompting yields the largest gains; lexical matching provides consistent improvements on matched subsets.

Multimodal Progress · IWSLT 2026 Cross-Lingual Voice Cloning · FishAudio-S2-Pro · Karlsruhe Institute of Technology