Distant conversational speech recognition: Challenges and Opportunities
Host: Sunit Sivasankaran, Microsoft Research
Speaker: Dr. Samuele Cornell, Carnegie Mellon University

State-of-the-art ASR systems excel on close-talk benchmarks but struggle with far-field conversational speech, where error rates remain above 20%. Current benchmark datasets inadequately assess generalization across domains and real-world conditions, often relying on oracle segmentation that yields overly optimistic results. Distant ASR (DASR) faces unique challenges including overlapping speech, varied recording setups, and dynamic speaker interactions that significantly complicate system development. Despite these difficulties, spontaneous conversational speech represents the next frontier for developing more human-like AI agents capable of natural multi-party communication.

This talk presents recent advances in DASR through three interconnected efforts: (1) the CHiME-7 and CHiME-8 DASR challenges, which established rigorous benchmarks for generalizable robust meeting transcription, (2) end-to-end joint modeling that unifies speaker diarization and speech recognition into a single framework, moving beyond traditional pipeline approaches, and (3) synthetic data generation leveraging large language models and text-to-speech systems to create realistic multi-speaker training data at scale.