
Big Techday 25: Sparse Models are the future: A deep dive into Mixture-of-Experts - Daria Soboleva

The limits of scalability have been reached. AI training compute has increased by 10^21 since AlexNet, but these models can't just get bigger forever. The most powerful language models today use less than 10% of their parameters for any given token, achieving significant computational savings while maintaining high quality. This efficiency comes from Mixture-of-Experts (MoE) architectures, which route different inputs to specialized expert networks instead of activating all parameters. Drawing on design choices from the latest trillion-parameter models, this talk covers why sparse MoE architectures represent the most viable path to efficient AI scaling in production systems.

About the speaker:

Daria Soboleva is Head Research Scientist at Cerebras, focusing on efficient AI systems and Large Language Models. She leads research on new LLM architectures, with a particular interest in Mixture-of-Experts models and hardware-optimized training. She is also the creator of SlimPajama, a 627B-token dataset that has become an industry standard with over 1M downloads, and of BTLM-3B-8K, which achieved 7B-parameter performance with significantly less compute. Previously, Daria worked at Google and other tech giants, building diverse expertise in ML and software engineering. Her research interests span efficient scaling of language models, data quality optimization, and specialized hardware architectures for AI. Daria holds a Master's degree in Computer Science from Moscow State University, with a specialization in AI and Machine Learning.
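The routing idea in the talk abstract (each token activates only a few expert networks rather than all parameters) can be illustrated with a minimal top-k routing sketch. This is not code from the talk: the names `make_expert` and `moe_forward`, the single-matrix experts, and the sizes (16 experts, 1 active per token) are hypothetical choices made only to show the mechanism.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    """Hypothetical expert: one dim x dim linear map (real experts are MLPs)."""
    w = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]

    return expert


def moe_forward(x, router_w, experts, top_k):
    """Route one token to its top_k experts and mix their outputs."""
    # The router produces one logit per expert.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_w]
    # Only the top_k highest-scoring experts are evaluated for this token.
    order = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = order[:top_k]
    # Gate weights are a softmax over the chosen experts' logits only.
    gates = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, i in zip(gates, chosen):
        y = experts[i](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen


# Illustrative sizes only (assumptions, not figures from the talk).
rng = random.Random(0)
dim, num_experts, top_k = 8, 16, 1
experts = [make_expert(dim, rng) for _ in range(num_experts)]
router_w = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
token = [rng.gauss(0, 1) for _ in range(dim)]

out, chosen = moe_forward(token, router_w, experts, top_k)
# Fraction of expert parameters touched for this token (router excluded).
active_fraction = top_k / num_experts
print("experts used:", chosen, "active expert fraction:", active_fraction)
```

With 1 of 16 experts active, the token touches 1/16 of the expert parameters, which is the kind of sparsity the abstract refers to when it says less than 10% of parameters are used per token. Production MoE layers typically add load-balancing and batching across tokens, which this sketch omits.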
