Talk Title: Multi-agent Reinforcement Learning (MARL) for LLMs

Speaker: Natasha Jaques, Senior Research Scientist, Google DeepMind

Talk Abstract: Reinforcement Learning (RL) fine-tuning of Large Language Models (LLMs) has shown incredible promise, starting with RL from human feedback and continuing into recent results using verifiable rewards for reasoning tasks. However, prior to the LLM era, most major successes of RL were not single-agent but relied on techniques like self-play to unlock continuous self-improvement. I will discuss how to apply these multi-agent techniques to LLMs, enabling scalable training that improves reasoning and yields provably safe LLMs. First, we introduce a self-play safety game in which an attacker and a defender LLM co-evolve through a zero-sum adversarial game. The attacker attempts to find prompts that elicit an unsafe response from the defender, as judged by a reward model. Both agents use a hidden chain-of-thought to reason about how to develop and defend against attacks. Using well-known game-theoretic results, we show that if this game converges to the Nash equilibrium, the defender will output a safe response for any string input. Empirically, we show that our approach produces a model that is safer than models trained with RLHF, while retaining core chat and reasoning capabilities. Second, I will discuss how to use self-play on games to improve capabilities on math and reasoning benchmarks. Together, these results demonstrate the potential of online multi-agent RL training to enable continuous self-improvement and provable guarantees for LLMs.

Bio: Natasha Jaques is an Assistant Professor of Computer Science and Engineering at the University of Washington, and a Staff Research Scientist at Google DeepMind. Her research focuses on Social Reinforcement Learning in multi-agent and human-AI interactions.
During her PhD at MIT, she developed foundational techniques for training language models with Reinforcement Learning from Human Feedback (RLHF). In the multi-agent space, she has developed techniques for improving coordination through social influence, and for unsupervised environment design. Natasha's work has received various awards, including Best Demo at NeurIPS, an honourable mention for Best Paper at ICML, and the Outstanding PhD Dissertation Award from the Association for the Advancement of Affective Computing. Her work has been featured in Science Magazine, MIT Technology Review, Quartz, IEEE Spectrum, Boston Magazine, and on CBC radio, among others. Natasha earned her Master's degree from the University of British Columbia, undergraduate degrees in Computer Science and Psychology from the University of Regina, and was a postdoctoral fellow at UC Berkeley.
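The attacker-defender self-play game described in the abstract can be sketched as a toy zero-sum game. Everything below is illustrative, not the talk's actual method: the fixed prompt and response sets, the stubbed reward model (`UNSAFE_PROB`), and the multiplicative-weights update are simple stand-ins for the LLM policies and RL training the talk covers. The sketch only shows the zero-sum structure: the attacker is rewarded for eliciting unsafe behavior, the defender for avoiding it.

```python
import random

# Toy sketch of a zero-sum attacker/defender self-play safety game.
# All names and values here are illustrative stand-ins.

PROMPTS = ["benign question", "jailbreak attempt", "roleplay exploit"]
DEFENSES = ["comply", "refuse unsafe"]

# Stubbed reward model: probability that a (prompt, defense) pair
# produces an unsafe response.
UNSAFE_PROB = {
    ("benign question", "comply"): 0.0,
    ("benign question", "refuse unsafe"): 0.0,
    ("jailbreak attempt", "comply"): 0.9,
    ("jailbreak attempt", "refuse unsafe"): 0.1,
    ("roleplay exploit", "comply"): 0.7,
    ("roleplay exploit", "refuse unsafe"): 0.05,
}

def sample(weights, rng):
    """Sample a key with probability proportional to its weight."""
    r = rng.random() * sum(weights.values())
    for key, w in weights.items():
        r -= w
        if r <= 0:
            return key
    return key  # floating-point edge case

def self_play(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    attacker = {p: 1.0 for p in PROMPTS}   # attacker preference weights
    defender = {d: 1.0 for d in DEFENSES}  # defender preference weights
    for _ in range(steps):
        p = sample(attacker, rng)
        d = sample(defender, rng)
        unsafe = UNSAFE_PROB[(p, d)]
        # Zero-sum payoff: attacker gains when the outcome is unsafe,
        # defender gains when it is safe (multiplicative-weights update).
        attacker[p] *= 1 + lr * unsafe
        defender[d] *= 1 + lr * (1 - unsafe)
    return attacker, defender

attacker, defender = self_play()
# After training, the defender's weights should favor the safe policy,
# while the attacker concentrates on the more effective exploits.
```

In the talk's framing, convergence of a game like this to a Nash equilibrium is what underpins the safety guarantee; here the simple update just illustrates the co-evolving incentives.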