BLT (Byte Latent Transformer) in 3 minutes!
Are tokenizers the "Achilles' heel" of Large Language Models? In this video, we break down BLT (Byte Latent Transformer), a new architecture that ditches fixed tokenization for a more intelligent, byte-level approach. Most LLMs use Byte Pair Encoding (BPE), which merges text based on frequency, not meaning. BLT flips the script with entropy-based patching: it allocates more compute where the text is complex and compresses predictable data, leading to better reasoning and efficiency.

What we cover in 3 minutes:
✅ The Tokenization Problem: why BPE is "terrible for reasoning" and carries frequency-based biases.
✅ Entropy-Based Patching: how BLT dynamically groups bytes based on uncertainty (see the first sketch below).
✅ The 3-Transformer Architecture (see the skeleton below):
   1. Local Byte Encoder: turning raw bytes into dense representations.
   2. Latent Transformer: the heavy-lifting engine that reasons in "patch space."
   3. Local Byte Decoder: reconstructing bytes via cross-attention.
✅ Hash-based N-gram Embeddings: how BLT gains morphological structure without a vocabulary (see the last sketch below).

Chapters:
[00:00] The Flaw in Modern Tokenizers (BPE)
[00:32] Introducing BLT: Compute Following Entropy
[01:14] How Entropy Estimates Patch Boundaries
[01:48] Architecture Part 1: The Local Byte Encoder
[02:09] Architecture Part 2: The Latent Transformer
[02:23] Architecture Part 3: The Local Byte Decoder
[02:57] Using Hash-based N-gram Embeddings

Paper: Byte Latent Transformer: Patches Scale Better Than Tokens
Authors: Meta AI (FAIR)

#MachineLearning #LLMs #Transformers #AIResearch #ByteLevel #Tokenization #DeepLearning #ScalingLaws #BLT
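
If you want to see the patching idea concretely, here is a minimal Python sketch of entropy-based patching. It assumes per-byte entropy estimates already come from a small byte-level language model; the threshold value, function name, and example numbers are illustrative, not taken from the paper.

```python
# Minimal sketch of entropy-based patching, assuming per-byte entropy
# estimates from a small byte-level language model are already available.
# The threshold and helper name are illustrative, not the paper's values.
from typing import List

def entropy_patches(byte_seq: bytes, entropies: List[float],
                    threshold: float = 2.0) -> List[bytes]:
    """Group bytes into patches: start a new patch whenever the model's
    next-byte entropy exceeds the threshold (the text becomes 'surprising')."""
    patches, current = [], bytearray()
    for b, h in zip(byte_seq, entropies):
        if current and h > threshold:
            patches.append(bytes(current))  # close the predictable run
            current = bytearray()
        current.append(b)
    if current:
        patches.append(bytes(current))
    return patches

# Toy example: low-entropy bytes merge into long patches,
# high-entropy bytes open new ones.
text = b"the cat sat"
fake_entropies = [3.1, 0.4, 0.3, 1.9, 3.0, 0.5, 0.4, 1.8, 3.2, 0.6, 0.5]
print(entropy_patches(text, fake_entropies))  # [b'the ', b'cat ', b'sat']
```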
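
To make the three-transformer layout easier to picture, here is a rough PyTorch skeleton under toy assumptions: layer counts, the mean-pooling of bytes into patches, the batch-size-1 handling, and the cross-attention wiring are simplified stand-ins for what the paper describes, and the class and argument names are made up for illustration.

```python
# Rough skeleton of BLT's encoder -> latent -> decoder flow (toy dimensions,
# simplified pooling and attention; not the paper's exact implementation).
import torch
import torch.nn as nn

class BLTSketch(nn.Module):
    def __init__(self, d_byte=256, d_latent=512, n_heads=8):
        super().__init__()
        self.byte_emb = nn.Embedding(256, d_byte)  # raw bytes -> dense vectors
        enc_layer = nn.TransformerEncoderLayer(d_byte, n_heads, batch_first=True)
        self.local_encoder = nn.TransformerEncoder(enc_layer, num_layers=1)
        self.to_latent = nn.Linear(d_byte, d_latent)
        lat_layer = nn.TransformerEncoderLayer(d_latent, n_heads, batch_first=True)
        self.latent = nn.TransformerEncoder(lat_layer, num_layers=4)  # heavy lifting in patch space
        self.from_latent = nn.Linear(d_latent, d_byte)
        self.cross_attn = nn.MultiheadAttention(d_byte, n_heads, batch_first=True)
        self.byte_head = nn.Linear(d_byte, 256)  # next-byte logits

    def forward(self, byte_ids, patch_bounds):
        # byte_ids: (1, T) ints in [0, 255]; patch_bounds: list of (start, end)
        h = self.local_encoder(self.byte_emb(byte_ids))            # local byte encoder
        # mean-pool byte states into one vector per patch (toy stand-in for the paper's pooling)
        patches = torch.stack([h[0, s:e].mean(0) for s, e in patch_bounds]).unsqueeze(0)
        z = self.latent(self.to_latent(patches))                   # latent transformer over patches
        kv = self.from_latent(z)
        # local byte decoder: byte states attend to patch states via cross-attention
        dec, _ = self.cross_attn(h, kv, kv)
        return self.byte_head(dec)

model = BLTSketch()
ids = torch.randint(0, 256, (1, 12))
logits = model(ids, patch_bounds=[(0, 5), (5, 9), (9, 12)])
print(logits.shape)  # torch.Size([1, 12, 256])
```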
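
Finally, a hedged sketch of hash-based n-gram embeddings: each byte position is enriched with embeddings of the n-grams ending at it, looked up by hashing into a fixed-size table, so no explicit vocabulary is needed. The table size, n-gram lengths, and hash function here are assumptions, not the paper's exact choices.

```python
# Sketch of hash-based n-gram embeddings: n-grams are bucketed into a
# fixed-size table via a hash, so no vocabulary is built or stored.
# Table size, n-gram range, and the hash are illustrative assumptions.
import numpy as np

TABLE_SIZE, DIM = 1 << 15, 64
rng = np.random.default_rng(0)
ngram_table = rng.normal(size=(TABLE_SIZE, DIM)).astype(np.float32)

def ngram_hash_embedding(byte_seq: bytes, ns=(3, 4, 5)) -> np.ndarray:
    """Return one vector per byte: the sum of hashed n-gram embeddings
    for every n-gram that ends at that byte position."""
    out = np.zeros((len(byte_seq), DIM), dtype=np.float32)
    for i in range(len(byte_seq)):
        for n in ns:
            if i + 1 >= n:
                gram = byte_seq[i + 1 - n:i + 1]
                idx = hash(gram) % TABLE_SIZE  # bucket the n-gram
                out[i] += ngram_table[idx]
    return out

emb = ngram_hash_embedding(b"transformer")
print(emb.shape)  # (11, 64)
```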