DeepSeek-R1 Paper Explained - A New RL LLMs Era in AI?
In this video, we dive into the groundbreaking DeepSeek-R1 research paper, titled "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning". The paper introduces DeepSeek-R1-Zero and DeepSeek-R1, open-source reasoning models that rival the performance of top-tier models like OpenAI's o1!

Here's a quick overview of what we'll cover:
Training a Large Language Model (LLM) using only Reinforcement Learning (RL) in post-training, without Supervised Fine-tuning (SFT)
The rule-based RL used in DeepSeek-R1 for large-scale RL training
Intriguing insights, including the "aha" moment
The DeepSeek-R1 training pipeline
Performance results

Written review - https://aipapersacademy.com/deepseek-r1/
Paper - https://arxiv.org/abs/2501.12948
Project page - https://github.com/deepseek-ai/DeepSe...
-----------------------------------------------------------------------------------------------
✉️ Join the newsletter - https://aipapersacademy.com/newsletter/
👍 Please like & subscribe if you enjoy this content
Become a patron - / aipapersacademy
The video was edited using VideoScribe - https://tidd.ly/44TZEiX
-----------------------------------------------------------------------------------------------
Chapters:
0:00 Introduction
0:52 LLMs Training
2:20 RL-only LLM (DeepSeek-R1-Zero)
2:53 Rule-based RL
4:41 DeepSeek-R1-Zero Insights
5:41 DeepSeek-R1 Aha Moment
6:09 Training DeepSeek-R1
8:48 DeepSeek-R1 Results