Generative AI is WRONG? 😱 VL-JEPA Explained (Yann LeCun's Vision) | VL-JEPA Explained: 2.8x Faster
Can AI truly "understand" without just predicting the next word? Meta's new research says YES. In this video, we break down VL-JEPA (Vision-Language Joint Embedding Predictive Architecture), a new research paper from Meta FAIR (Yann LeCun's team). Unlike standard Vision Language Models (VLMs) such as GPT-4V or Llama Vision, which generate text token by token (slow and expensive), VL-JEPA predicts embeddings (meaning) directly. This shift allows for real-time processing, major efficiency gains, and a smarter way for AI to perceive the world, which is essential for future robotics and AR tech.

📄 Key Concepts Covered:
Generative vs. Predictive: Why guessing the "next token" is inefficient for vision.
Embeddings Explained: How AI captures the meaning of "Darkness" without needing the word "Dark."
Selective Decoding: How this model saves battery by staying silent until something actually changes (see the sketch after the links below).
Performance: Achieving 2.85x faster decoding with 50% fewer parameters!

⏱️ Timestamps:
0:00 - The Problem with "Generative" Vision AI
0:45 - What is VL-JEPA? (Generative vs Predictive)
2:30 - How it Works: X-Encoder & Predictor Explained
4:00 - The Game Changer: Selective Decoding
5:30 - Is this the path to AGI?

🔗 References & Links:
Paper Title: VL-JEPA: Joint Embedding Predictive Architecture for Vision-Language
Authors: Meta FAIR (Shukor, Moutakanni, et al.)
Read the Paper: https://arxiv.org/pdf/2512.10942

#VLJEPA #MetaAI #YannLeCun #ArtificialIntelligence #ComputerVision #MachineLearning #AGI #TechNews #AIResearch
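
To make the generative-vs-predictive contrast and selective decoding concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not Meta's code or the paper's API: the "predictor" is a frozen random projection, the cost model simply counts decoder passes, and all function names are hypothetical.

# Hypothetical sketch: token-by-token generation vs. embedding prediction
# with selective decoding. Names and cost model are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
PROJ = rng.standard_normal((64, 256))      # frozen toy "predictor" weights (stand-in)

def generative_caption(frame, max_tokens=20):
    """Standard VLM behaviour: the decoder runs once per output token (toy cost model)."""
    return max_tokens                      # autoregressive: every token costs a full decoder pass

def predict_embedding(frame):
    """JEPA-style behaviour: a single forward pass yields one embedding for the frame."""
    return frame.reshape(-1) @ PROJ        # toy linear "predictor"

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def selective_decoding(frames, threshold=0.9):
    """Only invoke the expensive text decoder when the predicted embedding drifts
    away from the last decoded one, i.e. when the scene has meaningfully changed."""
    decoder_calls, last_decoded = 0, None
    for frame in frames:
        z = predict_embedding(frame)       # cheap: runs on every frame
        if last_decoded is None or cosine(z, last_decoded) < threshold:
            decoder_calls += 1             # expensive: decode the embedding into text
            last_decoded = z
    return decoder_calls

if __name__ == "__main__":
    scene_a = rng.standard_normal((8, 8))  # e.g. a lit room
    scene_b = rng.standard_normal((8, 8))  # e.g. the lights go out
    frames = [scene_a, scene_a, scene_a, scene_b, scene_b]
    print("generative: decoder calls for ONE frame:", generative_caption(frames[0]))
    print("selective : decoder calls for", len(frames), "frames:", selective_decoding(frames))

In this toy setup the generative path pays one decoder pass per token on every frame, while the selective path runs the cheap embedding predictor on every frame and only decodes twice, once per actual scene change, which is the intuition behind the claimed efficiency gains.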