Deriving the Policy Gradient Theorem and REINFORCE
Code: https://github.com/priyammaz/PyTorch-...

Prereqs:
Bellman Equation Derivation • Mathing the Bellman Equation: Derivation o...
Monte Carlo Methods • Monte Carlo Methods for Model-Free Learnin...

Awesome Resources:
Blog post by Lilian Weng https://lilianweng.github.io/posts/20...
Sutton and Barto Book http://incompleteideas.net/book/RLboo...

Today we are exploring one of the most important shifts in RL: Policy Gradients! Until now, the methods we have looked at, like Q-Learning, have had two stages. The first stage is to have a neural network estimate Q values, and the second stage is to derive the policy from those Q values (typically with a greedy method). But let's cut out the middle step: can we just train a model to estimate the policy directly? Yes! That is what we learn here today with Policy Gradients and the REINFORCE method! We will spend the majority of this video deriving the Policy Gradient Theorem, but the implementation is pretty easy after that!

Timestamps:
00:00:00 - Q Learning to Policy Networks
00:02:10 - Stationary Distributions
00:06:00 - What is our Cost Function?
00:10:45 - Derive the Policy Gradient Theorem
00:12:29 - Derivative of the Value Function
00:17:30 - What constants can we ignore?
00:23:00 - Simplifying the Recursion
00:32:00 - Exploiting the Recursion to simplify!
00:41:50 - Where did the stationary distribution go?
00:43:50 - Unveiling the stationary distribution!!
00:50:50 - Wrapping up the derivation
00:57:10 - REINFORCE Algorithm (Monte-Carlo Methods)
01:03:50 - Implementation
01:22:20 - Results
01:23:00 - Recap

Socials!
X / data_adventurer
Instagram / nixielights
Linkedin / priyammaz
Discord / discord
🚀 Github: https://github.com/priyammaz
🌐 Website: https://www.priyammazumdar.com/
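
To make the idea of training the policy directly a bit more concrete, here is a minimal sketch of a REINFORCE-style update. This is not the code from the linked repository (which uses PyTorch). It is a standard-library illustration that assumes a state-independent softmax policy on a two-armed bandit, and every name in it (discounted_returns, reinforce_step, train_bandit) is hypothetical. It relies on the standard REINFORCE estimate, in which the gradient of the objective is approximated by the sampled return G_t times the gradient of log pi(a_t), and on the fact that for a softmax over logits the gradient of log pi(a) is one_hot(a) - pi.

```python
import math
import random


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, actions, rewards, gamma=0.99, lr=0.1):
    """One gradient-ascent step on the sampled REINFORCE objective.

    Assumption: the policy ignores the state and is a softmax over
    `logits`, so grad log pi(a) w.r.t. the logits is one_hot(a) - pi.
    """
    probs = softmax(logits)
    grads = [0.0] * len(logits)
    for a, g in zip(actions, discounted_returns(rewards, gamma)):
        for i in range(len(logits)):
            grads[i] += g * ((1.0 if i == a else 0.0) - probs[i])
    return [w + lr * gr for w, gr in zip(logits, grads)]


def train_bandit(steps=200, seed=0):
    """Toy environment (an assumption): arm 1 pays 1, arm 0 pays 0."""
    rng = random.Random(seed)
    logits = [0.0, 0.0]
    for _ in range(steps):
        probs = softmax(logits)
        action = 1 if rng.random() < probs[1] else 0
        reward = 1.0 if action == 1 else 0.0
        # Each one-step episode is its own Monte Carlo sample.
        logits = reinforce_step(logits, [action], [reward])
    return logits
```

Calling train_bandit() should shift probability mass toward the paying arm, since only rewarded actions produce a nonzero update. A neural-network policy would replace the hand-written gradient with automatic differentiation of the same return-weighted log-probability.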