Get exclusive access to AI resources and project ideas: https://the-data-entrepreneurs.kit.co...

This is the 6th video in a series on using large language models (LLMs) in practice. Here, I review key aspects of developing a foundation LLM, drawing on the development of models such as GPT-3, Llama, Falcon, and beyond.

More Resources:
▶️ Series Playlist: https://www.youtube.com/playlist?list...
Read more: https://medium.com/towards-data-scien...

References:
[1] BloombergGPT: https://arxiv.org/pdf/2303.17564.pdf
[2] Llama 2: https://ai.meta.com/research/publicat...
[3] LLM Energy Costs: https://www.statista.com/statistics/1...
[4] arXiv:2005.14165 [cs.CL]
[5] Falcon 180B Blog: https://huggingface.co/blog/falcon-180b
[6] arXiv:2101.00027 [cs.CL]
[7] Alpaca Repo: https://github.com/gururise/AlpacaDat...
[8] arXiv:2303.18223 [cs.CL]
[9] arXiv:2112.11446 [cs.CL]
[10] arXiv:1508.07909 [cs.CL]
[11] SentencePiece: https://github.com/google/sentencepie...
[12] Tokenizers Doc: https://huggingface.co/docs/tokenizer...
[13] arXiv:1706.03762 [cs.CL]
[14] Andrej Karpathy Lecture: Let's build GPT: from scratch, in cod...
[15] Hugging Face NLP Course: https://huggingface.co/learn/nlp-cour...
[16] arXiv:1810.04805 [cs.CL]
[17] arXiv:1910.13461 [cs.CL]
[18] arXiv:1603.05027 [cs.CV]
[19] arXiv:1607.06450 [stat.ML]
[20] arXiv:1803.02155 [cs.CL]
[21] arXiv:2203.15556 [cs.CL]
[22] Train With Mixed Precision (NVIDIA docs): https://docs.nvidia.com/deeplearning/...
[23] DeepSpeed Doc: https://www.deepspeed.ai/training/
[24] https://paperswithcode.com/method/wei...
[25] https://towardsdatascience.com/what-i...
[26] arXiv:2001.08361 [cs.LG]
[27] arXiv:1803.05457 [cs.AI]
[28] arXiv:1905.07830 [cs.CL]
[29] arXiv:2009.03300 [cs.CY]
[30] arXiv:2109.07958 [cs.CL]
[31] https://huggingface.co/blog/evaluatin...
[32] https://www.cs.toronto.edu/~hinton/ab...

--
Homepage: https://shawhintalebi.com/
Book a call: https://calendly.com/shawhintalebi

Chapters:
Intro - 0:00
How much does it cost? - 1:30
4 Key Steps - 3:55
Step 1: Data Curation - 4:19
1.1: Data Sources - 5:31
1.2: Data Diversity - 7:45
1.3: Data Preparation - 9:06
Step 2: Model Architecture (Transformers) - 13:17
2.1: 3 Types of Transformers - 15:13
2.2: Other Design Choices - 18:27
2.3: How big do I make it? - 22:45
Step 3: Training at Scale - 24:20
3.1: Training Stability - 26:52
3.2: Hyperparameters - 28:06
Step 4: Evaluation - 29:14
4.1: Multiple-choice Tasks - 30:22
4.2: Open-ended Tasks - 32:59
What's next? - 34:31
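
As a companion to the Data Preparation chapter (1.3) and the tokenization references above [10]-[12], here is a minimal sketch of training a byte-pair-encoding tokenizer with the Hugging Face tokenizers library. The corpus file name and vocabulary size are placeholders I chose for illustration, not values from the video.

```python
# Minimal BPE tokenizer training sketch using the Hugging Face `tokenizers`
# library [12]. "corpus.txt" and vocab_size are hypothetical placeholders.
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Start from an empty BPE model and split raw text on whitespace/punctuation
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

# Learn merge rules from the (hypothetical) text corpus
trainer = trainers.BpeTrainer(
    vocab_size=32000,                           # placeholder vocabulary size
    special_tokens=["[UNK]", "[PAD]", "[EOS]"], # placeholder special tokens
)
tokenizer.train(files=["corpus.txt"], trainer=trainer)

# Inspect how a sentence gets split into subword tokens
print(tokenizer.encode("How to build an LLM from scratch").tokens)
```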
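
For Step 2 (Model Architecture), here is a minimal sketch of a single decoder-only transformer block in PyTorch, the GPT-style variant covered in chapter 2.1; this is illustrative code under my own assumptions (layer sizes are placeholders, and the pre-norm LayerNorm placement follows the design choices referenced in [18, 19]), not code shown in the video.

```python
# A minimal decoder-only (GPT-style) transformer block with causal
# self-attention. Dimensions are illustrative placeholders.
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.ln1 = nn.LayerNorm(d_model)  # pre-norm placement [18, 19]
        self.ln2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x):
        # Causal mask: each token may attend only to itself and earlier tokens
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device),
                          diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + self.drop(attn_out)          # residual connection
        x = x + self.drop(self.ff(self.ln2(x)))
        return x

# Usage: a batch of 4 sequences, 16 tokens each, embedding dim 512
block = DecoderBlock()
out = block(torch.randn(4, 16, 512))
print(out.shape)  # torch.Size([4, 16, 512])
```

A full model would stack many such blocks on top of token and position embeddings and finish with a linear head over the vocabulary; encoder-only (BERT [16]) and encoder-decoder (BART [17]) variants differ mainly in whether this causal mask is applied and whether cross-attention is added.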