Deep Dive: Model Distillation with DistillKit
In this deep dive video, we zoom in on model distillation, an advanced technique for building high-performance small language models at a reasonable cost. First, we explain what model distillation is. Then, we introduce two popular distillation strategies: logits distillation and hidden states distillation. We study in detail how they work and how they're implemented in the Arcee DistillKit open-source library. Finally, we look at two Arcee models built with distillation, Arcee SuperNova 70B and Arcee SuperNova Medius 14B.

Note: my calculation at 18:45 is wrong. It's 2.3 tera tokens, not 2.3 peta tokens. Sorry about that 🤡

If you'd like to understand how Arcee AI can help your organization build scalable and cost-efficient AI solutions, please get in touch at sales@arcee.ai or by booking a demo at https://www.arcee.ai/book-a-demo.

⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos. You can also follow me on Medium at /julsimon or on Substack at https://julsimon.substack.com. ⭐️⭐️⭐️

Slides: https://fr.slideshare.net/slideshow/d...
DistillKit: https://github.com/arcee-ai/DistillKit

00:00 Introduction
00:30 What is model distillation?
04:55 Model distillation with DistillKit
11:20 Logits distillation
20:10 Logits distillation with DistillKit
26:10 Hidden states distillation
31:35 Hidden states distillation with DistillKit
36:00 Pros and cons
40:32 Distillation example: Arcee SuperNova 70B
42:50 Distillation example: Arcee SuperNova Medius 14B
44:40 Conclusion
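To give a taste of the two strategies covered in the video, here is a minimal, dependency-free Python sketch of the core loss terms. This is illustrative only: the function names and the pure-Python vector math are my own, not DistillKit's actual API, and real implementations operate on batched tensors with an autograd framework.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature softens the
    # distribution, exposing more of the teacher's "dark knowledge".
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def logits_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # Logits distillation: KL divergence between the teacher's and the
    # student's temperature-softened next-token distributions, scaled by
    # T^2 so gradient magnitudes stay comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

def hidden_state_loss(teacher_hidden, student_hidden, projection):
    # Hidden states distillation: the student's hidden vector is mapped
    # into the teacher's hidden dimension with a (learned, here fixed)
    # projection matrix, then compared with mean squared error.
    projected = [sum(w * s for w, s in zip(row, student_hidden))
                 for row in projection]
    return sum((t - p) ** 2
               for t, p in zip(teacher_hidden, projected)) / len(teacher_hidden)
```

In practice, either loss is mixed with the standard cross-entropy loss on ground-truth tokens, with a weighting coefficient that balances imitating the teacher against fitting the training data.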