What if I told you that training AI isn't about giving it the right answers, but about giving it the right rewards? ⭐️⭐️⭐️⭐️⭐️

Recently, I sat down with Hansohl Kim, a reinforcement learning engineer at Anthropic, who pulled back the curtain and showed us how AI models actually learn 🎓 to behave the way they do.

While most AI training uses supervised learning (basically: "here's the question, here's the correct answer"), reinforcement learning works more like raising a child. 👶 The model tries something, gets a reward or penalty, and gradually learns what "good behavior" looks like, without ever being told the exact answer. This is how Claude learned its personality.

In this conversation, Hansohl breaks down:
🔹 Why RL is essential for teaching AI systems values, not just facts
🔹 The challenge of getting models to stop doing harmful things
🔹 Whether AI models can actually "lie" or hide what they know
🔹 Why the future of AI isn't smarter models, it's agents that think, plan, and iterate over time

If you've ever wondered what really happens inside companies like Anthropic, or how engineers are trying to make sure AI stays aligned with human values ❤️ as it gets more powerful, this is the conversation for you.

CHAPTERS
00:00 Introduction
00:55 Journey into AI and Anthropic
03:17 From Inference to Reinforcement Learning
04:21 Understanding Reinforcement Learning
11:57 Setting Guardrails in AI
14:09 Anti-Rewards Are a Very Blunt Approach
16:38 The First Training Environments Were Games
17:35 Behaviorism, Motivation & Why Models Can Lie
22:01 Beyond Reinforcement Learning
23:04 The Rise of AI Agents and Multi-Agent Systems
25:17 Conclusion and Final Thoughts

KEY INSIGHTS
🎯 Reinforcement learning focuses on feedback and alignment rather than right-or-wrong answers.
🎯 Designing high-quality environments is as crucial to RL as good data is to supervised learning.
🎯 AI models can display emergent behaviors, including self-correction, concealment, and strategic reasoning.

🔔 DON'T MISS OUT
Subscribe and hit the bell for more deep conversations on AI, innovation, and the future of intelligent systems.

📌 RELATED LINKS
🌐 Anthropic – https://www.anthropic.com
💼 Hansohl Kim on LinkedIn – / hansohlkim

#AI #reinforcementlearning #anthropic #artificialintelligence #machinelearning #gamethinking

------------------------------------------------------------------------------------

📚 ABOUT OUR CHANNEL 📚
We deconstruct breakout AI tools to help you innovate smarter and find product/market fit. Hosted by Amy Jo Kim, game designer and startup coach; previously Rock Band, The Sims, Covet Fashion, Happify, Netflix.

Check out our channel here: / gamethinkingtv
🔔 Don't forget to subscribe! 🔔

LEARN MORE ABOUT GAME THINKING
Check out our rapid innovation programs for product leaders 👍 https://www.gamethinking.io/programs
Join our free online community 📣 and get in on exclusive free events at https://gamethinking.io/gschool
Read our Game Thinking book 📘 at https://gamethinking.io/book/

FIND US AT 👇
https://gamethinking.io/

GET IN TOUCH 👍
[email protected]

FOLLOW US ON SOCIAL 📱
Get updates or reach out on our social media profiles!
https://x.com/amyjokim
/ amyjokim
/ amyjokim

Game Thinking TV