AI Dubbing Demystified: Insights from TTS Expert | Voices of the Industry Ep8 w/ Álex Pérez
AI dubbing is everywhere right now, but very few people actually understand how these systems work under the hood. Are we really close to replacing human voice actors? Why do some AI-dubbed clips sound great while others feel… off?

In this episode of “Voices of the Industry”, Belén sits down with Álex Pérez, text-to-speech scientist, to demystify AI dubbing, synthetic voices, and modern TTS / STS workflows. Together, they unpack the science (not just the marketing) behind AI dubbing and what it really takes to get natural, believable performances in multiple languages.

🎧 In this conversation, we cover:
✅ What “AI dubbing” actually means (TTS vs. voice conversion vs. speech-to-speech)
✅ The underlying tech: autoregressive vs. non-autoregressive TTS and transformer-based architectures
✅ How TTS and STS models are really trained, and why good data is so hard to get
✅ Hallucinations in AI-generated speech and why accent, style, and intent are still so tricky to control
✅ Why voice acting and lip-sync remain the hardest parts of AI dubbing
✅ Limitations of current tools for premium content (film, series, games) vs. e-learning, podcasts, etc.
✅ The future: promptable TTS, scene-level generation, and multimodal AI systems handling dubbing end-to-end

🎙️ Hosted by: Belén Agulló from the AI Localization Think Tank
💡 Guest: Álex Pérez, Lead Text-to-Speech Scientist at AppTek

---

🔍 In This Episode
00:00 Introduction
01:20 Meet Álex: Background and Expertise
03:42 Understanding AI Dubbing and Synthetic Voices
05:17 Technical Insights: TTS and Voice Conversion Models
13:38 Training AI Models for Dubbing
20:33 Challenges in AI Dubbing
30:49 Future of Synthetic Voices
34:44 Conclusion and Final Thoughts

---

Álex Pérez LinkedIn Profile: / alexdemartos

📖 Álex's selection of recent scientific papers on AI Dubbing:
➡️ Microsoft's VibeVoice (autoregressive): https://arxiv.org/pdf/2508.19205
➡️ F5-TTS (non-autoregressive): https://arxiv.org/pdf/2410.06885v1
➡️ Kyutai's DSM (autoregressive, streaming/low-latency TTS): https://arxiv.org/pdf/2509.08753v1

👉 Subscribe to the AI Localization Think Tank channel and newsletter for more conversations like this.
📢 Join the discussion on LinkedIn and tell us: What do you think about synthetic voices and their impact on the localization industry?