A slow description of "Language Models are Few-shot Learners", the paper that introduced the GPT-3 model, by T. Brown et al., published at NeurIPS in 2020.

Timestamps:
00:00 - GPT-3: Large Language Models are Few-shot Learners
00:26 - Outline
01:10 - Motivation: dropping fine-tuning
05:05 - Motivation: Meta-learning at scale
09:28 - Testing the scale hypothesis with GPT-3
14:40 - Approach and evaluation settings
17:18 - Evaluation settings
22:00 - Model and architectures
25:49 - Training data
30:51 - Common Crawl filtering details
33:17 - Training process
36:48 - Evaluation details
41:15 - Language modelling
42:39 - Language modelling/clozes
47:44 - Completion tasks
49:36 - Closed Book Question Answering
54:45 - Translation
01:00:44 - Winograd-Style Tasks
01:05:23 - Common Sense Reasoning
01:09:03 - Reading Comprehension
01:15:08 - SuperGLUE
01:23:21 - Natural Language Inference
01:26:01 - Few-shot Arithmetic
01:31:55 - Word scrambling and manipulation tasks
01:36:38 - SAT Analogies
01:38:52 - News Article Generation
01:44:29 - Further Analysis of News Article Generation
01:46:14 - Example generated articles
01:51:39 - Longer news articles
01:53:47 - Learning and Using Novel Words
01:58:11 - Grammar
02:02:24 - Poem generation
02:05:46 - Measuring and Preventing Memorisation of Benchmarks
02:13:25 - Further details for overlap analysis
02:14:38 - Limitations
02:22:43 - Broader Impacts, Misuse of Language Models
02:27:28 - External Incentive Structures
02:28:56 - Broader Impacts: Bias and Gender
02:33:39 - Gender and Race
02:39:29 - Religion, Future Bias and Fairness Challenges
02:45:00 - Energy usage
02:48:04 - Related Work
02:53:24 - Summary

Detailed description: We begin with the two questions that motivate this work: (1) How can we remove the need for fine-tuning language models? (2) Does in-context learning show strong gains with increased scale?
We describe "GPT-3", the 175 billion parameter model designed to investigate this latter question about scale in a previously unexplored regime. One key takeaway of this work is that the few-shot ability of language models appears to improve with greater model capacity. We then dig into the zero-shot, one-shot and few-shot evaluation settings used to evaluate GPT-3, and the low-level details of training and evaluation. We discuss results on a broad range of tasks, from arithmetic to writing poetry. We also talk through potential issues with the pretraining objective (which assigns equal weight to each token) and the comparatively low sample efficiency of GPT-3. We then turn to potential misuse of language models, bias relating to gender, race and religion, and energy usage. Finally, we close with a discussion of links to prior work, including transformers, differentiable meta-learning, the use of natural-language instructions for task unification, and T5.

Topics: #gpt-3 #ai #machinelearning #coding

Slides (pdf): https://samuelalbanie.com/files/diges...

The paper can be found on arXiv here: https://arxiv.org/abs/2005.14165

References for papers mentioned in the video can be found at http://samuelalbanie.com/digests/2022...
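To make the evaluation settings mentioned above concrete: zero-shot, one-shot and few-shot differ only in how many worked demonstrations are placed in the prompt before the query, with no gradient updates. A minimal sketch of this prompt construction (the `build_prompt` helper is hypothetical; the translation demonstrations echo the style of the paper's examples):

```python
def build_prompt(task_description, examples, query, k):
    """Assemble an in-context prompt with k demonstrations.

    k=0 -> zero-shot, k=1 -> one-shot, k>1 -> few-shot.
    The model sees this text as conditioning only; its weights are frozen.
    """
    lines = [task_description]
    for source, target in examples[:k]:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is asked to complete this line
    return "\n".join(lines)


# Illustrative English-to-French demonstrations.
demos = [("sea otter", "loutre de mer"), ("cheese", "fromage")]

zero_shot = build_prompt("Translate English to French:", demos, "mint", k=0)
few_shot = build_prompt("Translate English to French:", demos, "mint", k=2)
print(few_shot)
```

In the few-shot case GPT-3 is given as many demonstrations as fit in its context window (typically 10 to 100), so in practice `k` is bounded by context length rather than by a fixed constant.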