Measuring Exponential Trends Rising (in AI) - Joel Becker, METR
Joel Becker explains how METR (Model Evaluation and Threat Research) assesses whether AI could pose enormous or catastrophic risks. He discusses METR's publicized work, such as the time horizon chart (task difficulty measured in human time at 50% reliability), how tasks are selected and constrained (economic relevance, auto-grading, non-messy scope), and why time horizon is often misread as how long agents run. The conversation covers Opus 4.5's perceived jump, the challenges of redoing developer productivity RCTs as workflows and adoption change, and why current models aren't yet catastrophically dangerous even though discontinuous capability gains remain possible if R&D loops become fully automated. Becker also summarizes research linking slowdowns in compute growth to slower capability progress, tells his Manifold trading story, which was driven by a charity market he could influence, notes the mixed social value of prediction markets, and previews METR's 2026 plans, safeguards work, and hiring.

00:00 What METR Does
00:39 Podcast Intro
02:53 Threat Models Shift
03:33 Time Horizon Origin
04:56 Choosing Eval Tasks
06:25 Messy Real Work
08:10 HCAST And RE Bench
09:13 Human Time Misread
11:37 Opus 4.5 Surprise
14:27 Redoing Uplift RCTs
18:52 Measuring Productivity
20:55 Why Not Dangerous Yet
22:22 Capability Explosion
26:23 Benchmarks Miss Tail
28:08 Beyond One Number
29:50 Compute Slows Progress
30:47 Algorithms Need Compute
32:45 Lab Spend and Visibility
34:57 Cluster Timelines and Shipping
36:44 Prediction Markets and Models
38:10 Manifold Trading Story
39:52 Ethics and Insider Info
43:04 Beyond Benchmarks Evals
48:29 Harnesses and Scaffolding
51:39 METR Roadmap and Hiring
54:24 Karaoke and Human Voice
55:53 Closing Thoughts