Atticus Geiger from Pr(Ai)²R Group explores "State of Interpretability & Ideas for Scaling Up," outlining prediction, control, and understanding as the goals of AI interpretability. He then maps out a path for advancing interpretability research.

Highlights:
🔹 Probes - measure mutual information to interpret concepts
🔹 Steering - directs model behavior via interventions
🔹 Sparse Autoencoders (SAEs) - encode complex data but limit feature specificity
🔹 Scaling Goals - emphasize counterfactual data, success criteria, and context-sensitive features

The Alignment Workshop is a series of events convening top ML researchers from industry and academia, along with experts from the government and nonprofit sectors, to discuss and debate topics related to AI alignment. The goal is to enable researchers and policymakers to better understand potential risks from advanced AI, and strategies for solving them.

If you are interested in attending future workshops, please fill out the following expression-of-interest form to get notified about future events: https://far.ai/futures-eoi

Find more talks on this YouTube channel and at https://www.alignment-workshop.com/

#AlignmentWorkshop
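The "Probes" highlight above refers to training a simple classifier on a model's hidden states to test whether a concept is decodable from them. As a minimal illustration of that idea (using entirely synthetic data, not any real model's activations), the sketch below fits a linear probe by logistic regression with plain gradient descent:

```python
import numpy as np

# Toy setup: 16-dimensional "hidden states" in which a binary concept
# is linearly encoded along a fixed direction, plus unit Gaussian noise.
# All data here is synthetic, for illustration only.
rng = np.random.default_rng(0)
d, n = 16, 2000
concept_dir = rng.normal(size=d)
labels = rng.integers(0, 2, size=n)                      # concept present or not
hidden = rng.normal(size=(n, d)) + np.outer(2 * labels - 1, concept_dir)

# Linear probe: logistic regression trained with gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(hidden @ w + b)))              # predicted P(concept)
    w -= 0.5 * (hidden.T @ (p - labels) / n)             # gradient step on weights
    b -= 0.5 * np.mean(p - labels)                       # gradient step on bias

acc = np.mean((hidden @ w + b > 0) == labels)
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy indicates the concept is linearly decodable from the representations, which is the sense in which probes measure information about concepts; as the talk notes, this alone does not establish that the model actually uses that information.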