A Walkthrough of In-Context Learning and Induction Heads Part 1 of 2 (w/ Charles Frye)

A walkthrough of the Anthropic paper In-Context Learning and Induction Heads. Charles Frye and I read through the paper, discuss it, and give intuitions. Part 2 coming soon!

I was a core research contributor on this paper, but Chris Olah, Nelson Elhage and Catherine Olsson deserve far more of the credit!

The paper: https://transformer-circuits.pub/2022...

Timestamps
00:00:00 Intro
00:01:23 Ch1: Themes and high-level takes
00:01:23 Ch1a: Why mechanistic interpretability?
00:03:28 Ch1b: Why in-context learning?
00:08:11 Ch1c: Universality and in-context learning
00:11:44 Ch1d: Phase transitions and micro/macro lenses
00:14:11 Ch1e: Interpretability during training
00:18:10 Ch1f: Alignment, deployment, and interpretability
00:21:37 Ch2: Recap of arguments
00:22:39 Ch2a: Argument 1 - Macroscopic Co-occurrence
00:22:58 Ch2b: Argument 2 - Macroscopic Co-perturbation
00:24:20 Ch2c: Argument 3 - Direct Ablation
00:24:56 Ch2d: Argument 4 - Specific Examples of Generality
00:26:49 Ch2e: Argument 5 - Mechanistic Plausibility of Generality
00:28:25 Ch2f: Argument 6 - Continuity from Small to Large Models
00:29:13 Ch2g: Per-token loss analysis with PCA
00:35:21 Ch3: Argument 1 - Macroscopic phase change co-occurrence
00:36:41 Ch3a: Aside: Few-shot learning vs in-context learning
00:41:45 Ch3b: Figure - Derivative of loss with respect to token index
00:44:18 Ch3c: Figure - Induction heads form in phase change
00:50:57 Ch3d: Figure - Loss curves diverge during training
00:53:03 Ch3e: Figure - Per-token losses before and after the phase change
00:58:24 Ch3f: Assessing the evidence
