Diego Fajardo - No single test is enough
How do we know a model is actually ready for high-stakes use? In healthcare and life sciences, that question gets complicated fast. A model can look strong on one task, weak on another, and still surprise you when the stakes become real. That makes the real problem bigger than evaluation alone. It is about understanding what a model can do, where it breaks, what different kinds of evidence really tell us, and how we can move from isolated results to a more complete picture of readiness. This talk explores that broader challenge, from benchmarks and expert review to interpretability and more realistic task-based testing, and asks what it would take to evaluate models in a way that actually matches how they will be used.

Diego Fajardo leads evaluation work at Lumos, a startup focused on improving the real-world performance and safety of AI models and agents in healthcare and life sciences. His work spans benchmark design, expert evaluation, use of LLMs as judges, and interactive testing approaches to better understand model performance in complex settings. A major part of his current work focuses on AI patients: synthetic interactions designed to make benchmarking more realistic and scalable for patient-facing agents.

This session is brought to you by the Cohere Labs Open Science Community - a space where ML researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We'd like to extend a special thank you to Alif Munim and Abrar Frahman, Leads of our AI Safety and Alignment group, for their dedication in organizing this event.

If you’re interested in sharing your work, we welcome you to join us! Simply fill out the form at https://forms.gle/ALND9i6KouEEpCnz6 to express your interest in becoming a speaker.

Join the Cohere Labs Open Science Community to see a full list of upcoming events (https://tinyurl.com/CohereLabsCommuni...).