Quintin Pope is a machine learning researcher focusing on natural language modeling and AI alignment. Among alignment researchers, Quintin stands out for his optimism: he believes that AI alignment is far more tractable than it seems, and that we appear to be on a good path to making the future great. On LessWrong, he has written one of the most popular posts of the past year, “My Objections To ‘We're All Gonna Die with Eliezer Yudkowsky’”, as well as many other highly upvoted posts on various alignment papers and on his own theory of alignment, shard theory.

Quintin’s Twitter: /quintinpope5
Quintin’s LessWrong profile: https://www.lesswrong.com/users/quint...
My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”: https://www.lesswrong.com/posts/wAczu...
The Shard Theory Sequence: https://www.lesswrong.com/s/nyEFg3AuJ...
Quintin’s Alignment Papers Roundup: https://www.lesswrong.com/s/5omSW4wNK...
Evolution provides no evidence for the sharp left turn: https://www.lesswrong.com/posts/hvz9q...
Deep Differentiable Logic Gate Networks: https://arxiv.org/abs/2210.08277
The Hydra Effect: Emergent Self-repair in Language Model Computations: https://arxiv.org/abs/2307.15771
Deep learning generalizes because the parameter-function map is biased towards simple functions: https://arxiv.org/abs/1805.08522
Bridging RL Theory and Practice with the Effective Horizon: https://arxiv.org/abs/2304.09853

PODCAST LINKS:
Video Transcript: https://www.theojaffee.com/p/5-quinti...
Spotify: https://open.spotify.com/show/1IJRtB8...
Apple Podcasts: https://podcasts.apple.com/us/podcast...
RSS: https://api.substack.com/feed/podcast...
Playlist of all episodes: Theo Jaffee Podcast
My Twitter: https://x.com/theojaffee
My Substack: https://www.theojaffee.com

CHAPTERS:
Introduction (0:00)
What Is AGI? (1:03)
What Can AGI Do? (12:49)
Orthogonality (23:14)
Mind Space (42:50)
Quintin’s Background and Optimism (55:06)
Mesa-Optimization and Reward Hacking (1:02:48)
Deceptive Alignment (1:11:52)
Shard Theory (1:24:10)
What Is Alignment? (1:30:05)
Misalignment and Evolution (1:37:21)
Mesa-Optimization and Reward Hacking, Part 2 (1:46:56)
RL Agents (1:55:02)
Monitoring AIs (2:09:29)
Mechanistic Interpretability (2:14:00)
AI Disempowering Humanity (2:28:13)