In this seventh lecture, we look at imitation learning in a Bayesian setting where we have a prior over the possible cost functions the human may prefer. We show that the problem, fundamentally one of exploration vs. exploitation, is intractable, and we explore a couple of remedies. The first is to simplify the problem to Bayesian active learning, where we show that efficient greedy algorithms can be near-optimal. The second is to recast the problem as a game and show that simple posterior sampling is in fact no-regret. Both options result in simple, intuitive algorithms that efficiently collapse uncertainty about the human's latent cost function (see the posterior-sampling sketch after the references).

For more information about me and my work, check out http://www.sanjibanchoudhury.com/

References:
1. Hsu et al., "What Makes Some POMDP Problems Easy to Approximate?" https://papers.nips.cc/paper/2007/fil...
2. Somani et al., "DESPOT: Online POMDP Planning with Regularization" https://papers.nips.cc/paper/2013/fil...
3. Javdani et al., "Shared Autonomy via Hindsight Optimization" https://arxiv.org/abs/1503.07619
4. Golovin et al., "Near-Optimal Bayesian Active Learning with Noisy Observations" https://arxiv.org/abs/1010.3091
5. Osband et al., "(More) Efficient Reinforcement Learning via Posterior Sampling" https://arxiv.org/abs/1306.0940
6. Golovin and Krause, "Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization" https://arxiv.org/abs/1003.3967
7. Sadigh et al., "Active Preference-Based Learning of Reward Functions" https://people.eecs.berkeley.edu/~sas...
8. Russo et al., "A Tutorial on Thompson Sampling" https://web.stanford.edu/~bvr/pubs/TS...
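To make the posterior-sampling idea concrete, here is a minimal sketch of the loop: sample a cost function from the current posterior, act as if it were the true one, observe the human's response, and update the posterior. This is not the lecture's implementation; it assumes a finite set of candidate cost functions, a stand-in planner `plan_with_cost`, and an illustrative noisy "human approves the plan" feedback model.

```python
import numpy as np

rng = np.random.default_rng(0)

# K candidate cost functions the human might hold, with a uniform prior.
K = 5
posterior = np.ones(K) / K

# Illustrative likelihood model (assumption): probability the human approves
# a plan optimized for candidate j when their true cost function is i.
approve_prob = 0.2 + 0.6 * np.eye(K)

true_cost = 3  # hidden latent cost function, unknown to the learner


def plan_with_cost(j):
    """Stand-in for a planner that optimizes candidate cost function j."""
    return j  # here a "plan" is just identified by the cost it optimizes


for t in range(50):
    # 1. Sample a cost function from the current posterior (Thompson sampling).
    sampled = rng.choice(K, p=posterior)
    # 2. Act optimally with respect to the sampled cost function.
    plan = plan_with_cost(sampled)
    # 3. Observe noisy human feedback on the executed plan.
    approved = rng.random() < approve_prob[true_cost, plan]
    # 4. Bayesian update of the posterior over cost functions.
    likelihood = approve_prob[:, plan] if approved else 1.0 - approve_prob[:, plan]
    posterior = posterior * likelihood
    posterior /= posterior.sum()

print("posterior over cost functions:", np.round(posterior, 3))
```

Because the agent samples rather than committing to the maximum a posteriori cost function, it keeps probing plausible alternatives in proportion to their posterior mass, which is the intuition behind the no-regret guarantee discussed in the lecture (see Osband et al. and Russo et al. above).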