Join the AI Evals Course starting March 16, 2026: https://maven.com/parlance-labs/evals...

Modern IR Evaluation in the RAG Era w/ Nandan Thakur. Learn about future directions in RAG evaluation, including the shift from traditional search setups to modern RAG systems. Nandan discusses zero-shot evaluation, diversity and grounding metrics, and the challenges that modern retrieval scenarios pose for RAG.

00:00 Introduction to Modern Evaluation in the RAG Era
00:14 Speaker Background and Experience
01:09 Overview of IR Evaluation
01:45 History of Traditional IR Evaluation
03:54 The Cranfield Paradigm
06:00 Examples of Test Collections
07:38 Introduction to Zero-Shot Evaluation
10:09 Challenges with Zero-Shot Evaluation
14:43 Limitations of Current Test Collections
17:28 The Evolution of IR Evaluation in the RAG Era
18:20 Traditional Search vs. RAG Systems
21:08 Metrics for Traditional Search and RAG Systems
24:50 Motivation for New IR Benchmarks
26:16 Building FreshStack
28:18 FreshStack Pipeline
31:26 Evaluation of FreshStack
32:40 Understanding Coverage of Unique Nuggets
32:56 Relevance and Recall in Document Retrieval
33:19 Introduction to FreshStack Results
33:31 Challenges with Current Retrieval Techniques
34:14 Fusion Models and Their Performance
34:40 Highlighting the Performance Gap
34:55 Maintaining the FreshStack Leaderboard
35:10 Evaluating New Models on FreshStack
35:35 Easy Script for Evaluating Retrieval Models
36:01 Key Takeaways from the Presentation
36:24 Discussion on Search and Retrieval Metrics
36:50 FreshStack as a Benchmark
37:26 Q&A: Generalization of FreshStack
39:23 Q&A: Domain-Specific Retrieval Evaluations
44:32 Q&A: Leaderboard and Model Performance
48:55 Q&A: Future of Retrieval and Evaluation