Genie: Generative Interactive Environments with Ashley Edwards - 696

Today, we're joined by Ashley Edwards, a member of technical staff at Runway, to discuss Genie: Generative Interactive Environments (https://arxiv.org/abs/2402.15391), a system for creating ‘playable’ video environments for training deep reinforcement learning (RL) agents at scale in a completely unsupervised manner. We explore the motivations behind Genie, the challenges of data acquisition for RL, and Genie’s capability to learn world models from videos without explicit action data, enabling seamless interaction and frame prediction. Ashley walks us through Genie’s core components (the latent action model, video tokenizer, and dynamics model) and explains how these elements collaborate to predict future frames in video sequences. We discuss the model architecture, training strategies, and benchmarks used, as well as the application of spatiotemporal transformers and MaskGIT techniques for efficient token prediction and representation. Finally, we touch on Genie’s practical implications, its comparison to other video generation models like “Sora,” and potential future directions in video generation and diffusion models.

🎧 / 🎥 Listen or watch the full episode on our page: https://twimlai.com/go/696.

🔔 Subscribe to our channel for more great content just like this: https://youtube.com/twimlai?sub_confi...

🗣️ CONNECT WITH US!
===============================
Subscribe to the TWIML AI Podcast: https://twimlai.com/podcast/twimlai/
Follow us on Twitter: / twimlai
Follow us on LinkedIn: / twimlai
Join our Slack Community: https://twimlai.com/community/
Subscribe to our newsletter: https://twimlai.com/newsletter/
Want to get in touch? Send us a message: https://twimlai.com/contact/

📖 CHAPTERS
===============================
00:00 - Introduction
3:06 - Motivation
4:10 - Data sources
4:53 - Results
6:37 - Major components
9:41 - Spatiotemporal transformers
12:40 - Latent action model
17:19 - Video tokenizer
21:28 - Dynamics model
38:19 - Genie vs other video generation models
40:04 - Implications of Genie
41:06 - Gaps
44:10 - Future directions

🔗 LINKS & RESOURCES
===============================
Genie: Generative Interactive Environments paper - https://arxiv.org/abs/2402.15391
MaskGIT - https://github.com/google-research/ma...
Sora | OpenAI - https://openai.com/index/sora/
Spatial-Temporal Transformer Networks for Traffic Flow Forecasting - https://arxiv.org/abs/2001.02908
Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - https://twimlai.com/podcast/twimlai/m...
V-JEPA, AI Reasoning from a Non-Generative Architecture with Mido Assran - https://twimlai.com/podcast/twimlai/v...

📸 Camera: https://amzn.to/3TQ3zsg
🎙️ Microphone: https://amzn.to/3t5zXeV
🚦 Lights: https://amzn.to/3TQlX49
🎛️ Audio Interface: https://amzn.to/3TVFAIq
🎚️ Stream Deck: https://amzn.to/3zzm7F5