Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers
This paper introduces a significant shift in artificial intelligence, moving from models that simply **"Think about Images"** to those that can truly **"Think with Images"**. Previously, AI models treated visual information as a static, initial input, converting it into text for reasoning, which often led to a **semantic gap** and limitations in complex tasks. The new **"Thinking with Images" paradigm** transforms vision into a **dynamic, manipulable cognitive workspace**, allowing models to use visual information as intermediate steps in their thought processes, similar to a human using a sketchpad.

This evolution unfolds across three key stages: **Stage 1: Tool-Driven Visual Exploration**, where models command a fixed set of external visual analysis tools; **Stage 2: Programmatic Visual Manipulation**, where models generate custom code to perform tailored visual operations; and **Stage 3: Intrinsic Visual Imagination**, the most advanced stage, where models internally generate new visual thoughts or simulations within a closed cognitive loop.

While this new approach enables more robust and human-like visual cognition, it faces challenges such as high computational costs, potential error propagation from dense visual information, and the need for new architectural designs to bridge the gap between language and pixels. The paper provides a comprehensive overview of these stages, their methods, relevant evaluations, and applications, aiming to guide future research towards more powerful multimodal AI.

Paper: https://arxiv.org/pdf/2506.23918
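To make Stage 1 more concrete, here is a minimal Python sketch of what "commanding a fixed set of external visual tools" could look like. It is not taken from the paper: the toy image format, the `crop` and `zoom` tools, the `TOOLS` registry, and `run_tool_calls` are all hypothetical names chosen for illustration, and the list of tool calls stands in for commands a model would emit during reasoning.

```python
from typing import Any, Callable, Dict, List

# Toy grayscale image: a list of rows of pixel intensities (assumption for illustration).
Image = List[List[int]]


def crop(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Return the sub-image of the given size starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def zoom(image: Image, factor: int) -> Image:
    """Nearest-neighbour upsampling: repeat each pixel `factor` times per axis."""
    out: Image = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical fixed tool set: the model can only pick from these names.
TOOLS: Dict[str, Callable[..., Image]] = {"crop": crop, "zoom": zoom}


def run_tool_calls(image: Image, calls: List[Dict[str, Any]]) -> List[Image]:
    """Apply tool calls in order and keep every intermediate image.

    Each call looks like {"tool": "crop", "args": {...}}. The returned list
    holds the original image followed by one new visual state per call.
    """
    states: List[Image] = [image]
    for call in calls:
        tool = TOOLS[call["tool"]]  # raises KeyError for tools outside the fixed set
        states.append(tool(states[-1], **call["args"]))
    return states


if __name__ == "__main__":
    picture = [[r * 4 + c for c in range(4)] for r in range(4)]
    plan = [
        {"tool": "crop", "args": {"top": 1, "left": 1, "height": 2, "width": 2}},
        {"tool": "zoom", "args": {"factor": 2}},
    ]
    for step, state in enumerate(run_tool_calls(picture, plan)):
        print(step, state)
```

Each entry in the returned list is an intermediate visual state that later reasoning steps could inspect, which is the sketchpad idea in miniature. Under the same assumptions, Stage 2 would differ in that the model writes new functions like `crop` itself instead of choosing from a fixed registry.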