The provided text is an excerpt from the technical paper "DeepSeek-OCR: Contexts Optical Compression," which introduces a vision-language model (VLM) for Optical Character Recognition (OCR) developed by DeepSeek-AI. The model explores optical 2D mapping as a method for compressing long textual contexts, addressing the computational cost that Large Language Models (LLMs) incur when processing long sequences. DeepSeek-OCR pairs the DeepEncoder with a DeepSeek3B-MoE decoder and is engineered to maintain high precision even at high compression ratios, achieving roughly 97% accuracy when text tokens are compressed 10× into vision tokens. The paper presents quantitative results demonstrating state-of-the-art performance on benchmarks such as OmniDocBench while using significantly fewer vision tokens than competing models, establishing the feasibility of the visual modality as an effective medium for text compression. The paper also discusses DeepSeek-OCR's practical utility, including its ability to generate large volumes of training data and its potential to simulate memory-forgetting mechanisms in LLMs through progressive image downsampling.
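The core idea of the compression claim can be sketched with simple arithmetic. The snippet below is an illustrative model, not DeepSeek-OCR's actual API: `compression_ratio` expresses the 10× figure (text tokens per vision token), and `vision_tokens_after_downsample` hypothetically models how progressively downsampling a page image shrinks its vision-token budget, the mechanism the paper likens to memory forgetting.

```python
def compression_ratio(n_text_tokens: int, n_vision_tokens: int) -> float:
    """Ratio of original text tokens to the vision tokens that replace them.

    A ratio of 10.0 corresponds to the 10x regime where the paper reports
    ~97% decoding accuracy.
    """
    return n_text_tokens / n_vision_tokens


def vision_tokens_after_downsample(base_tokens: int, scale: float) -> int:
    """Hypothetical token count after downsampling the page image by `scale`.

    Vision tokens are tied to image area, so halving each side (scale=0.5)
    quarters the token count -- older context can be kept at coarser scales,
    loosely simulating a fading memory.
    """
    return max(1, int(base_tokens * scale ** 2))


# A 1,000-token document rendered to a page image encoded as 100 vision tokens:
ratio = compression_ratio(1000, 100)
print(ratio)  # 10.0

# Downsampling that page to half resolution leaves ~25 vision tokens:
print(vision_tokens_after_downsample(100, 0.5))  # 25
```

The quadratic relationship between image scale and token count is what makes downsampling an attractive forgetting mechanism: modest resolution loss buys a large reduction in context length.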