Today, we're joined by Aditi Raghunathan, assistant professor at Carnegie Mellon University, to discuss the limitations of LLMs and how we can build more adaptable and creative models. We dig into her ICML 2025 Outstanding Paper Award winner, “Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction,” which examines why LLMs struggle to generate truly novel ideas. We explore the “Roll the dice” approach, which encourages structured exploration by injecting randomness at the start of generation, and the “Look before you leap” concept, which trains models to take “leaps of thought” using alternative objectives that yield more diverse and structured outputs. We also discuss Aditi’s papers on the counterintuitive phenomenon of “catastrophic overtraining,” in which training models on more data improves benchmark performance but degrades their ability to be fine-tuned for new tasks, and cover her lab's work on creating more controllable and reliable models, including “memorization sinks,” an architectural approach that isolates memorized information and enables its targeted unlearning.

🗒️ For the full list of resources for this episode, visit the show notes page: https://twimlai.com/go/747

🔔 Subscribe to our channel for more great content just like this: https://youtube.com/twimlai?sub_confi...

🗣️ CONNECT WITH US!
===============================
Subscribe to the TWIML AI Podcast: https://twimlai.com/podcast/twimlai/
Follow us on Twitter: / twimlai
Follow us on LinkedIn: / twimlai
Join our Slack Community: https://twimlai.com/community/
Subscribe to our newsletter: https://twimlai.com/newsletter/
Want to get in touch? Send us a message: https://twimlai.com/contact/

📖 CHAPTERS
===============================
00:00 - Introduction
4:30 - Gap between benchmark performance and real-world user experience
6:19 - Fine-tuning and model adaptability
10:16 - Token-to-parameter ratio
14:38 - Overtrained Language Models Are Harder to Fine-Tune paper
16:17 - Base model selection
17:55 - Unlearning
22:04 - Memorization Sinks: Isolating Memorization during LLM Training paper
29:05 - Role of memory in LLMs
30:53 - Going beyond the creative limits of next-token prediction paper
34:49 - Creativity
37:12 - Exploration
38:20 - How creativity differs in LLMs
44:22 - The “Look before you leap” part of the paper
46:36 - The “Roll the dice” part
52:43 - Compatibility with RL training
54:00 - Future directions

🔗 LINKS & RESOURCES
===============================
Aditi Raghunathan’s Group @ ICML 2025 - https://www.cs.cmu.edu/~aditirag/icml...
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction - https://arxiv.org/pdf/2504.15266
Overtrained Language Models Are Harder to Fine-Tune - https://arxiv.org/pdf/2503.19206
Memorization Sinks: Isolating Memorization during LLM Training - https://arxiv.org/pdf/2507.09937
Exploring the “Biology” of LLMs with Circuit Tracing with Emmanuel Ameisen - #727 - https://twimlai.com/podcast/twimlai/e...

📸 Camera: https://amzn.to/3TQ3zsg
🎙️ Microphone: https://amzn.to/3t5zXeV
🚦 Lights: https://amzn.to/3TQlX49
🎛️ Audio Interface: https://amzn.to/3TVFAIq
🎚️ Stream Deck: https://amzn.to/3zzm7F5