Can Speech Be Tokenized Like Text? A Brief Exploration of Neural Codec Models
Today we explore the general ideas behind modern Neural Codecs. They take a lot of cues from previous work on codecs, just reinterpreted for deep learning. Instead of hand-designed signal processing blocks such as filter banks, psychoacoustic models, and carefully engineered quantizers, we now learn these components directly from data. But the high-level ideas are surprisingly the same!

The purpose of this video is to set the stage for a future video on reproducing the EnCodec model!

Timestamps:
00:00:00 - Introduction
00:00:20 - What is a Codec?
00:02:15 - Bit Rates
00:02:35 - What is an MP3?
00:02:55 - Auditory Masking
00:05:20 - Psychoacoustics
00:07:35 - Linear Predictive Coding (LPC)
00:09:00 - Lots of autocorrelation!
00:10:00 - Quantization
00:11:13 - Bit Depth
00:12:30 - Residual Quantization and the Bit/Precision Tradeoff
00:15:40 - Residual Vector Quantization
00:20:50 - Summary
00:23:50 - Moving to Neural Codecs
00:24:20 - SoundStream Model
00:27:50 - EnCodec Model
00:28:45 - Loss Balancer
00:31:20 - An aside on Arithmetic Coding
00:33:15 - SpeechTokenizer Model
00:35:00 - Training Speech LLMs on Codecs
00:36:45 - VALLE Model
00:38:55 - What's next?

Socials!
X / data_adventurer
Instagram / nixielights
Linkedin / priyammaz
Discord / discord
🚀 Github: https://github.com/priyammaz
🌐 Website: https://www.priyammazumdar.com/