This video explains a new paper showing that Self-Training applied after language-model pre-training further improves the performance of RoBERTa-Large. The paper also demonstrates Self-Training gains in Knowledge Distillation and Few-Shot Learning. The authors additionally introduce an interesting unlabeled-data filtering algorithm, SentAugment, which improves performance and reduces the computational cost of this kind of self-training loop (a rough sketch of the loop follows the chapter list below). Thanks for watching! Please Subscribe!

Paper Links:
Paper Link: https://arxiv.org/pdf/2010.02194.pdf
Distributed Representations of Words and Phrases: https://papers.nips.cc/paper/5021-dis...
Rethinking Pre-training and Self-training: https://arxiv.org/pdf/2006.06882.pdf
Don't Stop Pretraining: https://arxiv.org/pdf/2004.10964.pdf
Universal Sentence Encoder: https://arxiv.org/abs/1803.11175
Common Crawl Corpus: https://commoncrawl.org/the-data/
Fairseq: https://github.com/pytorch/fairseq
BERT: https://arxiv.org/pdf/1810.04805.pdf
Noisy Student: https://arxiv.org/abs/1911.04252
POET: https://arxiv.org/pdf/1901.01753.pdf
PET - Small Language Models are Also Few-Shot Learners: https://arxiv.org/pdf/2009.07118.pdf

Chapters:
0:00 Introduction
1:50 Background on Transfer Learning
2:40 Self-Training
5:25 Not all unlabeled data is equally useful
6:54 SentAugment Retrieval and Filtering
12:55 Experimental Data
14:55 Results
18:15 Some Interesting Details
19:02 Ablations
20:20 Nearest Neighbor Visualization
21:05 Computational Cost of Self-Training
22:30 Few-Shot Learning comparison with GPT-3, PET
23:52 Phases of Representation Learning
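For anyone who wants the gist in code form: the loop discussed in the video is (1) retrieve task-relevant sentences from a large unlabeled corpus using sentence embeddings (the SentAugment filtering idea), (2) pseudo-label them with a teacher fine-tuned on the task, (3) keep only confident predictions and train a student on them. The sketch below is a minimal illustration of that idea, not the authors' implementation: `embed`, `Teacher`, and the toy corpus are hypothetical stand-ins for the paper's sentence encoder, fine-tuned RoBERTa-Large teacher, and Common Crawl sentence bank.

```python
import numpy as np

# Stand-in sentence encoder: returns one vector per sentence.
# (Hypothetical; the paper uses a learned sentence embedding model.)
def embed(sentences):
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(sentences), 8))

# Stand-in teacher: returns a (label, confidence) pair per sentence.
# (Hypothetical; the paper uses RoBERTa-Large fine-tuned on the task.)
class Teacher:
    def predict(self, sentences):
        return [(0, 0.9) for _ in sentences]

def sentaugment_retrieve(task_sentences, corpus, top_k=3):
    """Keep only the unlabeled sentences closest to the task data
    in sentence-embedding space (cosine similarity)."""
    task_emb = embed(task_sentences).mean(axis=0)
    corpus_emb = embed(corpus)
    sims = corpus_emb @ task_emb / (
        np.linalg.norm(corpus_emb, axis=1) * np.linalg.norm(task_emb) + 1e-9)
    order = np.argsort(-sims)[:top_k]
    return [corpus[i] for i in order]

def self_training_step(teacher, task_sentences, corpus, conf_threshold=0.8):
    """One round: retrieve in-domain unlabeled sentences, pseudo-label them
    with the teacher, and keep confident predictions as synthetic
    training data for a student model."""
    retrieved = sentaugment_retrieve(task_sentences, corpus)
    pseudo_labeled = []
    for sent, (label, conf) in zip(retrieved, teacher.predict(retrieved)):
        if conf >= conf_threshold:
            pseudo_labeled.append((sent, label))
    return pseudo_labeled

# Toy usage: the returned pairs would be used to train/distill a student.
if __name__ == "__main__":
    task = ["the movie was great", "terrible acting throughout"]
    web_corpus = ["best film of the year", "stock prices fell today",
                  "an utterly boring plot", "recipe for banana bread"]
    print(self_training_step(Teacher(), task, web_corpus))
```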