У нас вы можете посмотреть бесплатно Multimodal AI from First Principles - Neural Nets that can see, hear, AND write. или скачать в максимальном доступном качестве, видео которое было загружено на ютуб. Для загрузки выберите вариант из формы ниже:
Если кнопки скачивания не
загрузились
НАЖМИТЕ ЗДЕСЬ или обновите страницу
Если возникают проблемы со скачиванием видео, пожалуйста напишите в поддержку по адресу внизу
страницы.
Спасибо за использование сервиса ClipSaver.ru
Generative Large Language Models like OpenAI's GPT-4, Google's PaLM 2, and Discriminative models like ImageBind are models released in 2023 that combine visual and textual input to perform multi-modal tasks. Multimodal modeling combines multiple modalities to train neural networks - images, text, audio, etc empowering ML models to perform amazing multimodal tasks like text-image retrieval, multimodal vector arithmetic, visual question answering, and language modelling. To support the channel and access the Word documents/slides used in this video, consider JOINING the channel on Youtube or Patreon. Members get access to scripts, slides, animations, and illustrations for most of the videos on my channel! Patreon - / neuralbreakdownwithavb Follow on Twitter: @neural_avb In this video, I covered the essential published techniques for Multimodal Modelling and so many amazing results of the past few years that have left my jaws on the floor. Hope you enjoy it! Watch how Multimodal models generate images: • If LLMs are text models, how do they gener... #deeplearning #languagemodel #gpt #computervision Papers references in this video: Unifying Visual-Semantic Embeddings: https://arxiv.org/pdf/1411.2539.pdf CLIP: https://arxiv.org/abs/2102.02779 ImageBInd: https://arxiv.org/abs/2305.05665 BLIP: https://arxiv.org/abs/2201.12086 HERO: https://arxiv.org/pdf/2005.00200.pdf VL-T5: https://arxiv.org/pdf/2102.02779.pdf OFA: https://arxiv.org/abs/2202.03052 SimVLM: https://arxiv.org/abs/2108.10904 Frozen: https://arxiv.org/abs/2106.13884 Flamingo: https://arxiv.org/abs/2204.14198 MiniGPT4: https://arxiv.org/abs/2304.10592 Kosmos-1: https://arxiv.org/abs/2302.14045 PaLM-E: https://arxiv.org/abs/2303.03378 Timestamps: 0:00 - Intro 02:55 - Basics 05:05 - Contrastive Learning 07:54 - Masked Visual Language Models 10:20 - Unified Models 13:41 - Generative LLMs