🚀 Dive into the Future of Self-Evolving AI Agents! https://www.emergent-behaviors.com/dr...

In this video, we explore the innovative research paper "Dr. Zero: Self-Evolving Search Agents Without Training Data" authored by experts from Meta Superintelligence Labs and the University of Illinois Urbana-Champaign. Discover how self-evolving agents can learn complex reasoning tasks without the need for labeled training data, effectively tackling the current data bottleneck in AI.

Join us as we break down the unique architecture of Dr. Zero, highlighting the roles of the Proposer and Solver agents and the ingenious methods they use to evolve their capabilities. You'll learn about the challenges of traditional reinforcement learning in search and how innovative strategies like Hop-Grouped Relative Policy Optimization (HRPO) can lead to significant improvements in efficiency and performance.

📌 What You'll Learn:
• 🔍 The concept of self-evolution in AI agents without training data
• 🧠 The roles of Proposer and Solver in the Dr. Zero architecture
• 📈 How HRPO optimizes learning and reduces computational costs
• 🔄 The significance of "hops" in measuring question complexity
• 🏆 Results showing competitiveness against supervised learning baselines
• 🔮 Future implications for self-taught AI systems

⏳ Timestamps:
0:00 Introduction to Dr. Zero
0:47 The data bottleneck: we are running out of internet
1:27 The infinite loop of self-evolution (and how it collapses)
2:14 Meet the cast: a shared brain, two roles
2:47 Why search is harder than math: verification is messy
3:35 Measuring difficulty in hops: from Paris to multi-hop nightmares
4:11 Why standard RL (GRPO) gets too expensive for search
4:56 HRPO: group by hop count to compare apples to apples
5:38 Efficiency is king: one question per prompt and lower variance
6:26 The Goldilocks reward: target the solver's learning zone
7:15 Training montage: the agents build their own curriculum
8:01 Results: no labels, yet competitive with supervised baselines
8:42 Not all zeros are created equal: why Dr. Zero pulls ahead
9:24 The future is self-taught (mostly): takeaways and limits
10:19 TL;DR and code: what to remember about Dr. Zero

Dr. Zero: Self-Evolving Search Agents Without Training Data
https://arxiv.org/pdf/2601.07055
Zhenrui Yue, Kartikeya Upasani, Xianjun Yang, Suyu Ge, Shaoliang Nie, Yuning Mao, Zhe Liu, Dong Wang
Meta Superintelligence Labs, University of Illinois Urbana-Champaign

#AI #MachineLearning #SelfEvolvingAI #ReinforcementLearning #DataFree #NaturalLanguageProcessing #SearchAgents #ArtificialIntelligence #DrZero #HRPO #ResearchPaper #MetaAI #UIUC #Innovation #TechForGood
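
💡 Quick sketch of the "group by hop count" idea from the 4:56 segment. This is a rough illustration only, not the paper's implementation: it assumes each generated question carries a hop count and a scalar reward, and that rewards are turned into advantages by normalizing against other questions with the same hop count (mean and standard deviation, in the style of group-relative baselines). The function name `hop_grouped_advantages`, the `eps` smoothing term, and the sample numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def hop_grouped_advantages(rewards, hops, eps=1e-6):
    """Normalize each reward against others with the same hop count.

    Illustrative assumption: the baseline is the per-hop-group mean and
    the scale is the per-hop-group population standard deviation.
    """
    if len(rewards) != len(hops):
        raise ValueError("rewards and hops must have the same length")

    # Collect the indices of all samples that share a hop count.
    groups = defaultdict(list)
    for index, hop in enumerate(hops):
        groups[hop].append(index)

    advantages = [0.0] * len(rewards)
    for indices in groups.values():
        group_rewards = [rewards[i] for i in indices]
        baseline = mean(group_rewards)
        spread = pstdev(group_rewards)
        for i in indices:
            # eps avoids division by zero when every reward in the group is equal.
            advantages[i] = (rewards[i] - baseline) / (spread + eps)
    return advantages


# Hypothetical rewards for two 1-hop and two 2-hop questions.
sample_rewards = [1.0, 0.0, 0.5, 0.5]
sample_hops = [1, 1, 2, 2]
sample_advantages = hop_grouped_advantages(sample_rewards, sample_hops)
print(sample_advantages)
```

In this toy run the two 1-hop questions get opposite-signed advantages, while the two 2-hop questions (identical rewards) both get zero: each question is only compared with others of similar complexity, which is the "apples to apples" point.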