An introduction to Policy Gradient methods - Deep Reinforcement Learning
In this episode I introduce Policy Gradient methods for Deep Reinforcement Learning. After a general overview, I dive into Proximal Policy Optimization: an algorithm designed at OpenAI that tries to find a balance between sample efficiency and code complexity. PPO is the algorithm used to train the OpenAI Five system and is also used in a wide range of other challenges like Atari and robotic control tasks.

If you want to support this channel, here is my Patreon link: / arxivinsights
You are amazing!! ;)

If you have questions you would like to discuss with me personally, you can book a 1-on-1 video call through Pensight: https://pensight.com/x/xander-steenbr...

Links mentioned in the video:
⦁ PPO paper: https://arxiv.org/abs/1707.06347
⦁ TRPO paper: https://arxiv.org/abs/1502.05477
⦁ OpenAI PPO blogpost: https://blog.openai.com/openai-baseli...
⦁ Aurelien Geron: KL divergence and entropy in ML: • A Short Introduction to Entropy, Cros...
⦁ Deep RL Bootcamp - Lecture 5: • Deep RL Bootcamp Lecture 5: Natural ...
⦁ RL-adventure PyTorch implementation: https://github.com/higgsfield/RL-Adve...
⦁ OpenAI Baselines TensorFlow implementation: https://github.com/openai/baselines