RL 7: Monte-Carlo Method | Reinforcement Learning
Monte-Carlo Method in Reinforcement Learning

In the previous video, on policy iteration and value iteration, we assumed that the agent has access to a model of the environment. This assumption does not always hold. In this video we discuss the Monte-Carlo method (for prediction and control), with which an agent can improve its policy purely by interacting with the environment. We cover a specific variant called "exploring starts", in which each episode starts from a randomly selected state-action pair. The algorithm uses the framework of generalized policy iteration: it alternates between estimating action values from sampled episodes and making the policy greedy with respect to those estimates.

Reinforcement learning tutorial series:
1. Multi-armed Bandits: RL 1: Multi-armed Bandits 1
2. Multi-Armed Bandits - Action value estimation: RL 2: Multi-Armed Bandits 2 - Action value...
3. Upper confidence bound: RL 3: Upper confidence bound (UCB) to solv...
4. Thompson Sampling: RL 4: Thompson Sampling - Multi-armed bandits
5. Markov Decision Process - MDP: RL 5: Markov Decision Process - MDP | Rein...
6. Policy iteration and value iteration: RL 6: Policy iteration and value iteration...
7. Monte-Carlo Method: RL 7: Monte-Carlo Method | Reinforcement L...

#monte_carlo_method #reinforcement_learning
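
The exploring-starts idea summarized in the description can be sketched in a few lines of Python. This is a minimal illustration, not the code from the video. The corridor environment, its rewards, and the names `env_step` and `mc_exploring_starts` are all assumptions made up for this example. The agent never reads the transition rules. It only samples episodes, averages the observed returns into Q, and makes the policy greedy for each visited state, which is the generalized-policy-iteration loop.

```python
import random

# Hypothetical toy environment (an assumption, not taken from the video):
# a corridor with non-terminal states 0..3 and terminal state 4.
N_STATES = 4
TERMINAL = 4
ACTIONS = (0, 1)  # 0 = step (+1 state), 1 = jump (+2 states)


def env_step(state, action):
    """Return (next_state, reward). The learner only samples this function."""
    if action == 0:
        return state + 1, -1.0
    # Assumed rule: jumping is cheap from even states, costly from odd ones.
    reward = -1.0 if state % 2 == 0 else -4.0
    return min(state + 2, TERMINAL), reward


def mc_exploring_starts(n_episodes=5000, gamma=1.0, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]    # action-value estimates
    counts = [[0, 0] for _ in range(N_STATES)]   # visits per state-action pair
    policy = [0] * N_STATES                      # arbitrary initial policy

    for _ in range(n_episodes):
        # Exploring start: a uniformly random state-action pair.
        state = rng.randrange(N_STATES)
        action = rng.choice(ACTIONS)

        # Generate one episode, following the current policy after the start.
        episode = []
        while state != TERMINAL:
            next_state, reward = env_step(state, action)
            episode.append((state, action, reward))
            state = next_state
            if state != TERMINAL:
                action = policy[state]

        # Walk the episode backwards, accumulating the return G.
        # States only increase here, so each is visited at most once per
        # episode and first-visit and every-visit Monte Carlo coincide.
        g = 0.0
        for state, action, reward in reversed(episode):
            g = reward + gamma * g
            counts[state][action] += 1
            # Incremental sample average of the observed returns.
            q[state][action] += (g - q[state][action]) / counts[state][action]
            # Policy improvement: act greedily in the visited state.
            policy[state] = max(ACTIONS, key=lambda a: q[state][a])

    return q, policy


if __name__ == "__main__":
    q, policy = mc_exploring_starts()
    print("greedy policy:", policy)
    print("Q estimates:", q)
```

Under these assumed rewards, the greedy policy should settle on jumping from even states and stepping from odd ones. Environments in which a policy can loop forever would additionally need an episode-length cap, which this sketch avoids by construction.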