Multimodal Speech Summarization through Semantic Concept Learning - (3 minutes introduction)

Authors: Shruti Palaskar (Carnegie Mellon University, USA), Ruslan Salakhutdinov (Carnegie Mellon University, USA), Alan W. Black (Carnegie Mellon University, USA), Florian Metze (Carnegie Mellon University, USA)

Category: Spoken Language Processing I

Abstract: We propose a cascaded multimodal abstractive speech summarization model that generates semantic concepts as an intermediate step towards summarization. We describe a method to leverage existing multimodal dataset annotations to curate ground-truth labels for such intermediate concept modeling. In addition to enabling cascaded training, the concept labels provide an interpretable intermediate output level that helps improve performance on the downstream summarization task. On the open-domain How2 data, we conduct utterance-level and video-level experiments for two granularities of concepts: Specific and Abstract. We compare various multimodal fusion models for concept generation based on the respective input modalities. We observe consistent improvements in concept modeling by using multimodal adaptation models over unimodal models. Using the cascaded multimodal speech summarization model, we see a significant improvement of 7.5 METEOR points and 5.1 ROUGE-L points compared to previous methods of speech summarization. Finally, we show the benefits of scalability of the proposed approaches on 2000 h of video data.

For more details and a PDF version of the paper, visit: https://www.isca-speech.org/archive/i...
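
The abstract reports gains in ROUGE-L points. As a rough illustration of what that metric measures, here is a minimal sketch of a sentence-level ROUGE-L score built on the longest common subsequence (LCS) between a reference summary and a generated one. It is not the evaluation code used in the paper: the whitespace tokenization, the lowercasing, the balanced F1 weighting, and the function names `lcs_length` and `rouge_l_f1` are all assumptions made for this example. Reported "points" usually correspond to this score multiplied by 100.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # prev[j] holds the LCS length of the tokens of `a` seen so far and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L as a balanced F-score, in the range [0, 1].

    Assumption: simple lowercased whitespace tokenization; official
    toolkits may tokenize, stem, and weight precision/recall differently.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

Calling `rouge_l_f1(reference_summary, generated_summary)` on each pair and averaging over a test set gives a corpus-level figure; published results normally come from a standard ROUGE toolkit, so numbers from this sketch would not be directly comparable to those in the abstract.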
