🚀 How YouTube Scales LLM-Based Recommendations with "STATIC"
Ever wonder how Generative Retrieval (GR) can recommend the perfect video out of billions in milliseconds? While LLMs are powerful, they often struggle with a "validity gap", sometimes hallucinating item IDs that don’t exist or are out of stock. The standard fix is constrained decoding using a "trie" (a prefix tree), but traditional tries are slow on hardware accelerators like GPUs and TPUs because they rely on "pointer-chasing" and irregular memory access.

Introducing STATIC: Sparse Transition Matrix-Accelerated Trie Index

Our team developed a new framework that reimagines constrained decoding as vectorized sparse matrix operations. By flattening the prefix tree into a static Compressed Sparse Row (CSR) matrix, we’ve unlocked massive efficiency gains.

Key Highlights:

Massive Speedups: STATIC is 8.3x faster than standard CPU trie implementations and up to 102x faster than hardware-accelerated binary search baselines.
Ultra-Low Latency: It adds only 0.033 ms of overhead per decoding step.
Real-World Impact: We have successfully deployed STATIC on YouTube, a platform serving billions of users, to handle a vocabulary of 20 million fresh items.
Better Cold-Starts: Beyond speed, STATIC significantly improves cold-start performance for new items.

By shifting from "pointer-chasing" to "accelerator-native" linear algebra, we are bridging the gap between classic data structures and modern deep learning compilers like XLA.

🔗 Check out our code on GitHub: https://github.com/youtube/static-con...

#MachineLearning #LLM #YouTubeEngineering #GenerativeRetrieval #AI #SearchAndRecommenderSystems
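
To make the idea of flattening a prefix tree into a CSR matrix more concrete, here is a minimal sketch in plain Python. It is not the STATIC code or its API: the function names (`build_csr_trie`, `allowed_tokens`, `mask_logits`, `step`) and the toy item IDs are assumptions made for illustration, and the plain lists and loops stand in for the vectorized accelerator operations the description refers to. Each trie node becomes one row of a sparse transition matrix, where the column is a token ID and the stored value is the child node.

```python
def build_csr_trie(sequences):
    """Flatten a prefix tree over token-ID sequences into CSR-style arrays."""
    # Step 1: build an ordinary dict-based trie; node 0 is the root.
    children = [{}]
    for seq in sequences:
        node = 0
        for tok in seq:
            if tok not in children[node]:
                children[node][tok] = len(children)
                children.append({})
            node = children[node][tok]
    # Step 2: flatten. Row i holds the outgoing edges of node i:
    # col_idx stores the token ID, values stores the child node ID.
    row_ptr, col_idx, values = [0], [], []
    for edges in children:
        for tok in sorted(edges):
            col_idx.append(tok)
            values.append(edges[tok])
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, values


def allowed_tokens(csr, state):
    """Valid next tokens for a node: one contiguous slice, no pointer-chasing."""
    row_ptr, col_idx, _ = csr
    return col_idx[row_ptr[state]:row_ptr[state + 1]]


def mask_logits(csr, state, logits):
    """Set the logit of every token that would leave the trie to -inf."""
    allowed = set(allowed_tokens(csr, state))
    return [x if t in allowed else float("-inf") for t, x in enumerate(logits)]


def step(csr, state, token):
    """Follow the edge labelled `token` from `state` and return the child node."""
    row_ptr, col_idx, values = csr
    for k in range(row_ptr[state], row_ptr[state + 1]):
        if col_idx[k] == token:
            return values[k]
    raise ValueError(f"token {token} is not valid in state {state}")


# Toy catalogue: three valid item IDs, each a sequence of three tokens
# drawn from a hypothetical vocabulary of size 6.
csr = build_csr_trie([[1, 2, 3], [1, 2, 4], [5, 0, 3]])
state = step(csr, step(csr, 0, 1), 2)      # decode the prefix [1, 2]
masked = mask_logits(csr, state, [0.1] * 6)  # only tokens 3 and 4 stay finite
```

Because the three arrays are fixed once the catalogue is built, each decoding step reduces to index arithmetic on static buffers, which is the kind of access pattern a compiler such as XLA can handle well.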