RLHF Explained: How ChatGPT Learns from Humans (And Why It Breaks)

How do you train AI on tasks with no "correct answer", like writing jokes or summaries? RLHF (Reinforcement Learning from Human Feedback) is the technique that makes ChatGPT helpful, but it comes with a fatal flaw: reward hacking. Learn why AI systems inevitably find ways to game their own training, and what this means for AI capabilities.

📚 KEY CONCEPTS COVERED:
• Why human evaluation doesn't scale for training AI (you can't hire humans to judge billions of outputs)
• How RLHF trains a "reward model" to simulate human judgment
• The discriminator-generator gap: why ranking outputs is easier than creating perfect ones
• Reward hacking: how RL finds adversarial inputs that exploit the reward model
• Why RLHF must stop early, and what this means for AI improvement limits
• The crucial difference between verifiable domains (chess, math) and unverifiable domains (creativity, helpfulness)

⏱️ TIMESTAMPS:
0:00 - The "the the the the" problem
0:30 - Verifiable vs. unverifiable tasks
1:15 - Why human evaluation doesn't scale
1:45 - The RLHF solution explained
3:15 - The discriminator-generator gap
4:15 - Reward hacking collapse
5:45 - RLHF vs. "real" RL comparison

━━━━━━━━━━━━━━━━━━━━━━━━
📖 ORIGINAL SOURCE
This video distills concepts from:
• Deep Dive into LLMs like ChatGPT
Full credit to the original creator for the educational content. Please visit the source for the complete lecture.
━━━━━━━━━━━━━━━━━━━━━━━━

🎓 ABOUT LECTURE DISTILLED
"Long lectures. Short videos. Core insights."
We distill lengthy academic lectures into focused concept videos that capture the essential ideas. Perfect for students, researchers, and curious minds who want to understand complex topics efficiently.

🔗 GitHub: https://github.com/Augustinus12835/au...

#RLHF #ReinforcementLearning #MachineLearning #ChatGPT #AITraining #RewardHacking #ArtificialIntelligence #DeepLearning
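
The reward hacking and early stopping points listed under the key concepts can be illustrated with a toy sketch. This is not how RLHF is actually implemented: `true_quality` and `reward_model` are hypothetical one-dimensional stand-ins (assumptions made for illustration), chosen so that the learned reward agrees with human judgment only for small x. Optimizing against the proxy keeps raising the proxy score while the true quality peaks and then collapses.

```python
def true_quality(x: float) -> float:
    """Hypothetical stand-in for real human judgment: peaks at x = 2, then declines."""
    return x - 0.25 * x * x


def reward_model(x: float) -> float:
    """Hypothetical learned proxy: follows the human trend only for small x."""
    return x


def optimize(steps: int, step_size: float = 0.5):
    """Naive ascent on the proxy reward, recording both scores at every step."""
    x = 0.0
    history = []
    for step in range(steps + 1):
        history.append((step, x, reward_model(x), true_quality(x)))
        # The proxy's slope is +1 everywhere, so ascent always pushes x upward.
        x += step_size
    return history


history = optimize(steps=12)

# Early stopping: keep the step where true quality was highest.
# (In practice true quality is not directly observable; it is used here
# only to make the gap between proxy and reality visible.)
best_step = max(history, key=lambda row: row[3])[0]

print("step     x   proxy   quality")
for step, x, proxy, quality in history:
    print(f"{step:>4} {x:>5.1f} {proxy:>7.2f} {quality:>9.2f}")
print("best step to stop at:", best_step)
```

In this toy run the proxy reward rises at every step, while true quality peaks at step 4 (x = 2.0) and has turned negative by step 12. That gap is the reason for stopping optimization early rather than running it indefinitely.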