Generalization and Robustness in Offline Reinforcement Learning

Wen Sun (Cornell University)
https://simons.berkeley.edu/talks/tbd...
Quantifying Uncertainty: Stochastic, Adversarial, and Beyond

Offline Reinforcement Learning (RL) is a learning paradigm in which the RL agent learns only from a pre-collected static dataset and cannot interact with the environment any further. Offline RL is a promising approach for safety-critical applications where randomized exploration is not safe. In this talk, we study offline RL in large-scale settings with rich function approximation. In the first part of the talk, we study the generalization properties of offline RL and give a general model-based offline RL algorithm that provably generalizes in large-scale Markov Decision Processes. Our approach is also robust in the sense that as long as there is a high-quality policy whose traces are covered by the offline data, our algorithm will find it. In the second part of the talk, we consider the offline Imitation Learning (IL) setting, where the RL agent has an additional set of high-quality expert demonstrations. In this setting, we give an IL algorithm that learns with polynomial sample complexity and achieves state-of-the-art performance on standard continuous control robotics benchmarks.
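To make the setting concrete, here is a minimal sketch of model-based offline RL on a toy tabular MDP. It is not the algorithm presented in the talk, which targets rich function approximation. It swaps in a simple count-based penalty as a stand-in for pessimism, and every name in it (fit_model, pessimistic_value_iteration, beta, unvisited_penalty, the toy dataset) is a hypothetical chosen for illustration. The idea it shows is the one the abstract describes: the agent sees only a fixed dataset of (state, action, reward, next state) tuples, and it can only trust actions that the data covers.

```python
import numpy as np

def fit_model(dataset, n_states, n_actions):
    """Estimate transitions and mean rewards from (s, a, r, s_next) tuples."""
    counts = np.zeros((n_states, n_actions, n_states))
    reward_sum = np.zeros((n_states, n_actions))
    for s, a, r, s_next in dataset:
        counts[s, a, s_next] += 1
        reward_sum[s, a] += r
    n_sa = counts.sum(axis=2)
    safe_n = np.maximum(n_sa, 1)
    # Unvisited pairs get a uniform next-state guess; the penalty handles them.
    p_hat = np.where(n_sa[..., None] > 0, counts / safe_n[..., None], 1.0 / n_states)
    r_hat = reward_sum / safe_n
    return p_hat, r_hat, n_sa

def pessimistic_value_iteration(p_hat, r_hat, n_sa, gamma=0.9, beta=0.1,
                                unvisited_penalty=10.0, iters=500):
    """Value iteration on the learned model with a count-based reward penalty."""
    # Assumption: beta / sqrt(n) is an illustrative uncertainty penalty.
    penalty = beta / np.sqrt(np.maximum(n_sa, 1))
    penalty[n_sa == 0] = unvisited_penalty  # data gives no evidence here
    r_pess = r_hat - penalty
    v = np.zeros(p_hat.shape[0])
    for _ in range(iters):
        q = r_pess + gamma * (p_hat @ v)  # shape: (n_states, n_actions)
        v = q.max(axis=1)
    return q.argmax(axis=1), q

# Toy offline dataset: action 0 is well covered, action 1 barely or never.
dataset = (
    [(0, 0, 1.0, 1)] * 5
    + [(1, 0, 1.0, 0)] * 5
    + [(1, 1, 0.0, 1)]
)
p_hat, r_hat, n_sa = fit_model(dataset, n_states=2, n_actions=2)
policy, q = pessimistic_value_iteration(p_hat, r_hat, n_sa)
print(policy)
```

In this toy dataset the well-covered action earns reward in both states, so the penalized planner selects it everywhere and avoids the pair (state 0, action 1) that never appears in the data.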
