In this video, we discuss Meta's paper on the Byte Latent Transformer (BLT) model: "Byte Latent Transformer: Patches Scale Better Than Tokens". Quite literally, we go over every word in that title and what it means. Personally, I think dynamic compute allocation is a huge deal, and this feels like a pretty exciting research direction for LLMs going forward. I tried to present visually engaging material that explains the architectural design behind the various ideas in the paper.

Paper link: https://arxiv.org/abs/2412.09871

#deeplearning #ai

Join our Patreon to support the channel! Your support keeps the channel going! Members also get access to all the code, slides, documents, and animations produced in all my videos, including this one. Files are usually shared within a day of upload.
Patreon link: / neuralbreakdownwithavb
Direct link to the material used in this video: / byte-latent-blt-118825972

Related videos you may enjoy:
Transformers playlist: • Attention to Transformers from zero to hero!
The History of Attention: • Turns out Attention wasn't all we needed -...
Coding Language Models from scratch: • From Attention to Generative Language Mode...
Latent Space Models: • Visualizing the Latent Space: This video w...
Advanced Latent Space LLMs: • If LLMs are text models, how do they gener...
History of NLP: • 10 years of NLP history explained in 50 co...

Timestamps:
0:00 - Intro
1:21 - Intro to Transformers
3:39 - Subword Tokenizers
4:48 - Embeddings
7:10 - How does vocab size impact Transformer FLOPs?
11:15 - Byte Encodings
12:33 - Pros and Cons of Byte Tokens
15:05 - Patches
17:00 - Entropy
19:34 - Entropy model
23:40 - Dynamically Allocate Compute
25:11 - Latent Space
27:15 - BLT Architecture
29:30 - Local Encoder
34:06 - Latent Transformer and Local Decoder in BLT
36:08 - Outro
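To give a flavor of the core idea covered in the video (entropy-based patching, which is what enables dynamic compute allocation), here is a minimal Python sketch. It assumes you already have a small byte-level language model that outputs a 256-way next-byte probability distribution at each position; the function names and the threshold value here are hypothetical, purely for illustration, and not the paper's actual implementation.

```python
import math

def shannon_entropy(probs):
    # Entropy (in bits) of a next-byte probability distribution.
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def entropy_patches(byte_seq, next_byte_dists, threshold=2.0):
    # Split a byte sequence into patches, starting a new patch whenever
    # the entropy of the next-byte distribution (from a small byte-level
    # LM) exceeds a global threshold. High entropy = the model is unsure
    # what comes next, so the latent transformer should spend a step there.
    #
    # byte_seq:        list of ints in [0, 255]
    # next_byte_dists: one 256-way probability distribution per position,
    #                  where next_byte_dists[i] predicts byte_seq[i]
    # threshold:       illustrative value; in practice this is tuned
    patches, current = [], [byte_seq[0]]
    for b, dist in zip(byte_seq[1:], next_byte_dists[1:]):
        if shannon_entropy(dist) > threshold:
            patches.append(current)   # high uncertainty -> patch boundary
            current = [b]
        else:
            current.append(b)         # predictable byte -> extend patch
    patches.append(current)
    return patches
```

The upshot: predictable stretches of bytes get grouped into long patches (cheap), while surprising bytes start new patches (more compute), which is how BLT allocates compute dynamically instead of spending one transformer step per fixed token.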