Release Notes: Gemini's multimodality
Ani Baddepudi, Gemini Model Behavior Product Lead, joins host Logan Kilpatrick for a deep dive into Gemini's multimodal capabilities. Their conversation explores why Gemini was built as a natively multimodal model from day one, the future of proactive AI assistants, and how we are moving towards a world where "everything is vision." Learn about the differences between video and image understanding and token representations, higher FPS video sampling, and more.

Chapters:
0:00 - Intro
1:12 - Why Gemini is natively multimodal
2:23 - The technology behind multimodal models
5:15 - Video understanding with Gemini 2.5
9:25 - Deciding what to build next
13:23 - Building new product experiences with multimodal AI
17:15 - The vision for proactive assistants
24:13 - Improving video usability with variable FPS and frame tokenization
27:35 - What’s next for Gemini’s multimodal development
31:47 - Deep dive on Gemini’s document understanding capabilities
37:56 - The teamwork and collaboration behind Gemini
40:56 - What’s next with model behavior

Watch more Release Notes → https://goo.gle/4njokfg
Subscribe to Google for Developers → https://goo.gle/developers

Speakers: Logan Kilpatrick, Anirudh Baddepudi
Products Mentioned: Google AI, Gemini