How Continual Learning May Be Solved in AI (TITANS Explained)
TITANS introduces continual learning to AI models, enabling neural networks to keep learning during inference through test-time training. Unlike frozen transformers, which stop learning once training ends, TITANS runs gradient descent in real time inside a neural long-term memory module. This video explains how the TITANS architecture (from Google Research) achieves 98% accuracy on long-context benchmarks with 10x fewer parameters than competing models. We cover the memory system, the surprise-based learning signal, weight decay as a forgetting mechanism, and the MAC/MAG/MAL variants.

--------------
TIMESTAMPS
0:00 - The Memory Crisis
0:22 - Attention's Quadratic Scaling Problem
0:44 - The Linear Shortcut (Mamba, RWKV)
1:12 - Memory as the Missing Piece
1:40 - The Foundation: Associative Memory
2:03 - Neural Long-term Memory (Test-Time Learning)
2:32 - The Surprise Metric
3:16 - Learning to Forget
3:45 - The TITANS Architecture
4:24 - MAC, MAG, MAL: Three Memory Variants
5:02 - Benchmark Results
5:24 - The Future of Learning Models

----------------------
REFERENCES
TITANS: Learning to Memorize at Test Time
Behrouz, Zhong, Mirrokni (Google Research, 2024) - https://arxiv.org/abs/2501.00663

Transformers are SSMs
Dao, Gu (2024) - https://arxiv.org/abs/2405.21060

Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Gu, Dao (2023) - https://arxiv.org/abs/2312.00752

--------------------------
KEY CONCEPTS
Continual learning and test-time training
Neural long-term memory architecture
Associative memory with gradient-based updates
Surprise metric as self-supervised signal
Memory as Context (MAC) vs transformers
State space models (Mamba, RWKV, RetNet)

#titans #continuallearning #machinelearning #ai #transformers #neuralnetworks #deeplearning #mamba #googleresearch #memory #neuralmemory #gradientdescent
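
--------------
CODE SKETCH: TEST-TIME MEMORY UPDATE
To make the memory update described above concrete, here is a minimal PyTorch sketch. It is not code from the video or the paper: it assumes a small two-layer MLP as the memory, frozen key/value projections, and fixed scalar gates theta (inner learning rate), eta (surprise momentum), and alpha (weight decay / forgetting rate) in place of the learned, data-dependent gates TITANS uses; the helper names memorize/recall are illustrative.

import torch
import torch.nn as nn

d = 64                                    # hypothetical model width
memory = nn.Sequential(                   # M: the neural long-term memory
    nn.Linear(d, d), nn.SiLU(), nn.Linear(d, d)
)
W_K = nn.Linear(d, d, bias=False)         # key projection (kept frozen at test time)
W_V = nn.Linear(d, d, bias=False)         # value projection (kept frozen at test time)

theta, eta, alpha = 0.1, 0.9, 0.01        # illustrative constants, not learned gates
momentum = [torch.zeros_like(p) for p in memory.parameters()]

def memorize(x_t):
    # One test-time step: store the (key, value) association for token x_t
    # by taking a gradient step on the associative-memory loss ||M(k) - v||^2.
    k, v = W_K(x_t), W_V(x_t)
    loss = ((memory(k) - v) ** 2).mean()
    grads = torch.autograd.grad(loss, list(memory.parameters()))
    with torch.no_grad():
        for p, s, g in zip(memory.parameters(), momentum, grads):
            s.mul_(eta).add_(g, alpha=-theta)   # surprise signal with momentum
            p.mul_(1 - alpha).add_(s)           # weight decay = learning to forget
    return loss.item()                          # how "surprising" this token was

def recall(x_t):
    # Query the memory with the current token's key projection.
    with torch.no_grad():
        return memory(W_K(x_t))

# Usage: stream tokens one at a time, updating the memory as they arrive.
for x_t in torch.randn(16, d):            # 16 random stand-in tokens
    surprise = memorize(x_t)

In the full architecture this memory runs alongside attention; the MAC, MAG, and MAL variants differ in whether the memory's output is fed in as extra context, gated with the attention branch, or stacked as its own layer.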