In this episode of SciPulse, we dive into a groundbreaking technical report from DeepSeek-AI: "DeepSeek-OCR: Contexts Optical Compression." As Large Language Models (LLMs) face increasing computational hurdles when processing ultra-long textual content, researchers are looking for innovative ways to enhance efficiency. This paper explores a fascinating solution: leveraging the visual modality as a high-ratio compression medium for text. We break down how DeepSeek-OCR utilizes its novel DeepEncoder architecture to achieve significant token reduction while maintaining high precision. Whether you are a student, a researcher, or a tech enthusiast, this video provides a deep dive into the next frontier of Vision-Language Models (VLMs) and the potential for theoretically unlimited context architectures.

Key Topics Covered:
• The Concept of Optical Compression: why a picture might literally be worth a thousand words in the world of LLMs.
• DeepEncoder Architecture: a look at the 380M-parameter engine that connects window-attention and global-attention components through a 16× convolutional compressor to minimize vision tokens.
• Performance & Benchmarks: how the model achieves 97% decoding precision at 10× compression and outperforms much larger models like MinerU2.0 on OmniDocBench.
• Deep Parsing Capabilities: the model's ability to parse complex documents, including charts, chemical formulas, plane geometry, and natural images.
• Simulating Human Forgetting: how progressively downsizing images can mirror the natural decay of human memory for older contexts.

Main Insights & Significance: The paper demonstrates that compact language models can effectively learn to decode highly compressed visual representations. This "vision-text compression" paradigm offers a promising path toward addressing long-context challenges without the quadratic scaling costs usually associated with text-only processing.
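To make the headline numbers concrete, here is a rough sketch of the vision-token arithmetic. The 16×16 patch size, the 1024×1024 image resolution, and the 2560-token page are illustrative assumptions for this example; only the 16× compressor and the ~10× text-to-vision ratio come from the report itself.

```python
# Illustrative sketch of optical-compression token arithmetic.
# Assumptions (hypothetical): a 1024x1024 page image patchified into
# 16x16 patches, then passed through a 16x convolutional token compressor.

def vision_tokens(image_size: int, patch_size: int = 16,
                  compressor_ratio: int = 16) -> int:
    """Vision tokens left after patchifying and 16x compression."""
    patches_per_side = image_size // patch_size
    patch_tokens = patches_per_side * patches_per_side
    return patch_tokens // compressor_ratio

def compression_ratio(text_tokens: int, image_size: int) -> float:
    """How many text tokens each vision token stands in for."""
    return text_tokens / vision_tokens(image_size)

# 64x64 = 4096 patch tokens shrink to 256 vision tokens; if the same
# page held ~2560 text tokens, that is the 10x regime where the model
# reportedly keeps ~97% decoding precision.
print(vision_tokens(1024))           # → 256
print(compression_ratio(2560, 1024)) # → 10.0
```

The design point: attention cost grows with sequence length, so representing a page as a few hundred vision tokens instead of thousands of text tokens directly cuts the long-context burden.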
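The "human forgetting" idea — rendering older context at progressively lower resolution so it occupies fewer vision tokens — can be sketched as below. The halving-per-age schedule and the 256-pixel floor are assumptions for illustration, not the paper's exact policy.

```python
# Sketch of memory decay via progressive image downsizing: older context
# pages are re-rendered at lower resolution and so cost fewer vision
# tokens. Halving per age step and the 256-pixel floor are illustrative
# assumptions, not the report's exact schedule.

def downsized_resolution(base_size: int, age: int, floor: int = 256) -> int:
    """Halve the rendering resolution for each step of context age."""
    return max(floor, base_size >> age)

def tokens_for_page(image_size: int, patch_size: int = 16,
                    compressor_ratio: int = 16) -> int:
    """Vision tokens for one page at the given rendered resolution."""
    patches = (image_size // patch_size) ** 2
    return patches // compressor_ratio

# A 5-page history: page 0 is newest, page 4 is oldest.
budget = [tokens_for_page(downsized_resolution(1024, age))
          for age in range(5)]
print(budget)  # → [256, 64, 16, 16, 16]
```

Older pages fade to a fixed low-cost floor rather than vanishing, mirroring how distant memories blur but remain retrievable in outline.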
🎧 Listen to the audio discussion on Spotify (perfect for your commute): https://open.spotify.com/episode/1xmm...

Educational Disclaimer: This video is an educational summary designed to highlight key concepts from the research paper. It is not a replacement for reading the original technical report, which contains detailed methodologies and experimental data.

Original Research Paper: https://www.arxiv.org/pdf/2510.18234

#SciPulse #DeepSeek #AI #OCR #MachineLearning #DeepLearning #LLM #ComputerVision #ResearchPaper #ArtificialIntelligence #DataCompression #TechExplained #VLM #DeepEncoder