Open Model Pretraining Masterclass - Elie Bakouch, HuggingFace SmolLM 3, FineWeb, FinePDF
Today Elie Bakouch, who leads pre-training efforts at Hugging Face and is a key architect behind SmolLM, walks us through his five pillars of model training: data quality optimization, model architecture design, information extraction efficiency, gradient quality maximization, and training stability at scale. We also talk about the team's open-science data work, such as the FineWeb-Edu2 and FinePDF datasets, new alternatives to the Adam optimizer such as Muon and Shampoo, and the evolution of Mixture of Experts (MoE) architectures. Elie breaks down recent innovations, from DeepSeek's granular routing mechanisms to Alibaba's Qwen models achieving unprecedented sparsity levels (a minimal routing sketch follows the timestamps below).

00:00:00 Introduction
00:01:10 Hugging Face Research Team Overview
00:04:20 The Unified View of Model Training
00:10:17 Optimizer Innovation: Beyond Adam
00:21:15 MoE Architecture Deep Dive
00:29:26 Expert Specialization and Routing
00:33:09 Sparsity Trends and Production Models
00:40:20 Data Quality and Rephrasing Revolution
00:43:20 Small LM Training Insights
00:53:56 Open Source Tools and Future Directions
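For a concrete picture of the MoE routing and sparsity ideas mentioned above, here is a minimal top-k routing sketch in PyTorch. It is not code from the episode, and it does not reproduce DeepSeek's or Qwen's implementations; the layer sizes, expert count, and the softmax-over-selected-experts normalization are illustrative assumptions.

# Minimal sketch of top-k MoE routing (illustrative assumptions, not any
# production model's implementation): each token is sent to only top_k of
# n_experts feed-forward blocks, which is where the sparsity comes from.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, n_experts=32, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                          # x: (tokens, d_model)
        logits = self.router(x)                    # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # normalize over the chosen experts only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):             # each token visits only top_k experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e           # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot+1] * self.experts[e](x[mask])
        return out

# Each of the 8 tokens activates 2 of 32 experts, so most parameters stay idle per token.
y = TopKMoE()(torch.randn(8, 512))

A production MoE would batch tokens per expert and add a load-balancing loss for the router; the nested loop here is only to keep the routing logic readable.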