Everyone wants to build generative AI products that deliver real business value. But here's the catch: most systems fall short because teams don't know where to start when things go wrong. Debugging, evaluation, and learning how to look at your data are essential to iterating on and improving your systems effectively.

In this live-streamed fireside chat, Hugo Bowne-Anderson and Hamel Husain will explore how to:

- Use error analysis to identify the biggest pain points in your LLM workflows (an illustrative sketch of this workflow follows the chapter list below).
- Build evaluation frameworks that connect directly to your product goals.
- Develop a curiosity-driven approach to looking at data and traces, so you can iterate faster.
- Understand why debugging is the cornerstone of building reliable, scalable generative AI systems.

About Hamel Husain

Hamel Husain (Parlance Labs; ex-GitHub, Airbnb, DataRobot) has worked at the intersection of data science and AI engineering, helping teams scale LLM-powered systems through robust error analysis, evaluation, and debugging practices. His pragmatic approach emphasizes focusing on what matters most to deliver better outcomes.

Why Attend?

This is a conversation for anyone who has felt stuck trying to improve an LLM application. Whether you're debugging multi-turn conversations, working on agentic systems, or building evaluation frameworks, this session will give you the practical perspectives needed to iterate and build systems that deliver.

This conversation was originally planned as part of Hugo and Stefan Krawczyk's course, Building LLM Applications for Data Scientists and Software Engineers, but we had so many requests to make it public that we'll be live-streaming it to the world! https://maven.com/s/course/d56067f338

00:00 Introduction and Welcome
02:41 Introducing Hamel Husain
03:49 The Importance of Data Analysis for LLM-Powered Software
04:55 Systematic Data Examination
05:31 Practical Example: LLM Summarizing Hacker News Articles
07:16 LLM Error Analysis and Application Improvement
09:09 Andrew Ng's Error Analysis Example
10:24 Categorizing and Analyzing Errors / Failure Modes
22:40 Using Pivot Tables for Error Analysis
27:54 Iterating on Model Prompts
32:48 Tools and Techniques for Data Viewing
43:31 Misleading Dashboards for LLMs
44:43 Client Mistakes and Critique Shadowing
45:13 Importance of Error Analysis in LLMs
46:20 Tools and Techniques for Error Analysis
47:52 Evaluating Meta Prompts and Synthetic Data
52:47 Data Flywheels and Fine-Tuning
01:01:11 Challenges in Error Analysis and Business Metrics
01:05:48 Error Analysis in Multi-Turn Conversations
01:13:03 Clustering and Faceting in Error Analysis
01:23:46 Conclusion and Future Directions
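For readers who want a concrete starting point before watching, here is a minimal sketch of the pivot-table style of error analysis discussed in the chat (around 10:24 and 22:40). The dataset, column names, and failure-mode labels below are all hypothetical, not taken from the talk; the idea is simply to hand-label a sample of LLM traces with failure modes and then count where errors concentrate.

import pandas as pd

# Hypothetical hand-labeled traces: each row is one LLM interaction that a
# human reviewed, tagged with the feature it exercised and the failure mode
# observed (all names and labels here are illustrative).
traces = pd.DataFrame({
    "feature": ["summarize", "summarize", "qa", "qa", "qa", "summarize"],
    "failure_mode": ["hallucination", "truncation", "hallucination",
                     "none", "formatting", "none"],
})

# Pivot to count failure modes per feature, revealing where errors cluster
# and therefore which pain point to attack first.
pivot = traces.pivot_table(index="failure_mode", columns="feature",
                           aggfunc="size", fill_value=0)
print(pivot)

On a real system you would label a few dozen to a few hundred sampled traces rather than six, but even this tiny table makes the point: counting failure modes turns a vague sense that "the model is flaky" into a ranked list of problems to fix.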