We explore the concept of interactive learning. Robots must interact with the world to gather the data from which they learn. The principled way to learn when your data can be changing, possibly adversarially, is to strive to be "no regret", i.e., to do as well as the best policy in hindsight. But greedily picking the best policy in hindsight fails, even on the simplest of examples! Join us as we journey through the world of games to understand why and how the simple act of hedging not only achieves "no regret", but unlocks some of the most powerful algorithms in the universe!

We acknowledge Drew Bagnell for many insightful conversations on this topic.

Check out the full series "Core Concepts in Robotics": • Core Concepts in Robotics

For a deeper dive, check out the series "Imitation Learning: A Series of Deep Dives": • Imitation Learning: A Series of Deep Dives

References:
1. Drew Bagnell, lecture notes: http://www.cs.cmu.edu/~16831-f14/note...
2. Blum et al., "On-Line Algorithms in Machine Learning": https://www.cs.cmu.edu/~ninamf/ML10/o...
3. Shai Shalev-Shwartz, "Online Learning and Online Convex Optimization": https://www.cs.huji.ac.il/w~shais/pap...
4. Arora et al., "The Multiplicative Weights Update Method: a Meta Algorithm and Applications": https://www.cs.princeton.edu/~arora/p...
5. Kakade et al., "Mind the Duality Gap: Logarithmic regret algorithms for online optimization": https://proceedings.neurips.cc/paper/...
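To make the failure of greedy hindsight-optimization concrete, here is a minimal sketch (not from the video; names and the choice of step size eta are illustrative) comparing "follow the leader" against multiplicative-weights hedging on the classic two-expert alternating-loss example:

```python
import math

def follow_the_leader(loss_rounds):
    """Each round, greedily play the single expert with the lowest
    cumulative loss so far (ties broken by lowest index)."""
    n = len(loss_rounds[0])
    cum = [0.0] * n
    total = 0.0
    for losses in loss_rounds:
        leader = min(range(n), key=lambda i: cum[i])
        total += losses[leader]
        cum = [c + l for c, l in zip(cum, losses)]
    return total

def hedge(loss_rounds, eta=0.5):
    """Multiplicative weights (Hedge): keep a weight per expert, play the
    normalized distribution, then shrink each weight by exp(-eta * loss)."""
    n = len(loss_rounds[0])
    w = [1.0] * n
    total = 0.0
    for losses in loss_rounds:
        z = sum(w)
        p = [wi / z for wi in w]                      # mixed strategy this round
        total += sum(pi * li for pi, li in zip(p, losses))
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
    return total

# Adversarial example: two experts whose losses alternate (1,0),(0,1),(1,0),...
# FTL always chases the expert that just did well and eats loss 1 every round,
# while either fixed expert (the hindsight benchmark) only loses on half the rounds.
T = 100
rounds = [(1.0, 0.0) if t % 2 == 0 else (0.0, 1.0) for t in range(T)]
print(follow_the_leader(rounds))  # suffers loss T = 100
print(hedge(rounds))              # stays close to the best fixed expert (~T/2)
```

By randomizing over experts instead of committing to the current leader, Hedge cannot be forced to pay full loss every round; choosing eta on the order of sqrt(log n / T) gives the standard sublinear regret bound.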