Temporal Difference Learning (including Q-Learning) | Reinforcement Learning Part 4
The machine learning consultancy: https://truetheta.io
Join my email list to get educational and useful articles (and nothing else!): https://mailchi.mp/truetheta/true-the...
Want to work together? See here: https://truetheta.io/about/#want-to-w...

Part four of a six-part series on Reinforcement Learning. As the title says, it covers Temporal Difference Learning, Sarsa and Q-Learning, along with some examples.

SOCIAL MEDIA
LinkedIn: / dj-rich-90b91753
Twitter: / duanejrich
Github: https://github.com/Duane321

Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon: / mutualinformation

SOURCES
[1] R. Sutton and A. Barto. Reinforcement Learning: An Introduction (2nd Ed). MIT Press, 2018.
[2] H. Hasselt, et al. RL Lecture Series, Deepmind and UCL, 2021, DeepMind x UCL | Deep Learning Lecture Ser...

SOURCE NOTES
The video covers topics from chapters 6 and 7 of [1]. The whole series teaches from [1]. [2] has been a useful secondary resource.

TIMESTAMP
0:00 What We'll Learn
0:52 No Review
1:18 TD as an Adjusted Version of MC
2:49 TD Visualized with a Markov Reward Process
6:34 N-Step Temporal Difference Learning
8:08 MC vs TD on an Evaluation Example
11:50 TD's Trade-Off between N and Alpha
12:47 Why does TD Perform Better than MC?
15:29 N-Step Sarsa
17:15 Why have N above 1?
19:02 Q-Learning
20:50 Expected Sarsa
21:48 Cliff Walking
25:04 Windy GridWorld
28:12 Watch the Next Video!

NOTES
Code to compare TD vs MC on the evaluation task: https://github.com/Duane321/mutual_in...
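
For a rough feel of the one-step TD update the video builds on, here is a minimal sketch of tabular TD(0) evaluation. It is not taken from the linked repository: the five-state random walk, the function name td0_random_walk, and the default step size and episode count are all assumptions chosen for illustration.

```python
import random


def td0_random_walk(num_states=5, episodes=1000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values of a symmetric random walk with tabular TD(0).

    Hypothetical toy setup (an assumption, not from the video's code):
    states 1..num_states are non-terminal, states 0 and num_states + 1 are
    terminal, and only stepping into the right terminal state pays reward 1.
    """
    rng = random.Random(seed)
    # Terminal values stay at 0; non-terminal values start at an arbitrary 0.5.
    values = [0.0] * (num_states + 2)
    for s in range(1, num_states + 1):
        values[s] = 0.5

    for _ in range(episodes):
        state = (num_states + 1) // 2  # start each episode in the middle
        while 0 < state < num_states + 1:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == num_states + 1 else 0.0
            # TD(0): move V(s) toward the bootstrapped target r + gamma * V(s').
            target = reward + gamma * values[next_state]
            values[state] += alpha * (target - values[state])
            state = next_state

    return values[1:-1]


if __name__ == "__main__":
    estimates = td0_random_walk()
    # For this walk the true value of state i is i / (num_states + 1).
    for i, v in enumerate(estimates, start=1):
        print(f"state {i}: estimate {v:.3f}, true {i / 6:.3f}")
```

Sarsa and Q-Learning use the same form of update on action values instead of state values. They differ in the target: Sarsa bootstraps from the action actually taken next, while Q-Learning bootstraps from the maximum action value in the next state.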