Introduces a novel metric called the Deep-Thinking Ratio (DTR) to evaluate the quality of reasoning in large language models. Research indicates that simple output length is an unreliable proxy for accuracy, as longer responses often reflect overthinking or error amplification rather than effective problem-solving. Instead of measuring surface-level features, DTR tracks how token predictions stabilize across a model’s internal layers, identifying "deep-thinking tokens" that require sustained computational revision in later layers. This approach demonstrates a significantly higher correlation with task accuracy across rigorous mathematical and scientific benchmarks compared to traditional confidence-based or length-based methods. Furthermore, the authors present Think@n, a strategy that utilizes DTR to select the highest-quality responses during parallel sampling. This method effectively matches the performance of standard consensus techniques while reducing total computational costs by approximately half.
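The description does not give the exact formula, but the core idea can be sketched as follows. Assume a logit-lens-style readout that records each intermediate layer's top predicted token for every output position; a token counts as "deep-thinking" if its prediction is still being revised in the later layers, and DTR is the fraction of such tokens. The function names, the `late_fraction` parameter, and the stabilization criterion below are illustrative assumptions, not the paper's definitions:

```python
import numpy as np

def deep_thinking_ratio(layer_top_tokens, late_fraction=0.5):
    """Sketch of a Deep-Thinking Ratio for one response.

    layer_top_tokens: array of shape (num_layers, num_tokens) holding the
    argmax token id each intermediate layer predicts at each position
    (e.g. via a logit-lens readout). A token is "deep-thinking" if its
    prediction still changes within the last `late_fraction` of layers,
    i.e. it has not stabilized early.  (Assumed criterion.)
    """
    preds = np.asarray(layer_top_tokens)
    num_layers, _ = preds.shape
    late_start = int(num_layers * (1 - late_fraction))
    late = preds[late_start:]   # predictions in the later layers
    final = preds[-1]           # the model's final prediction
    # Still being revised late = any late-layer prediction differs from final.
    deep = (late != final).any(axis=0)
    return float(deep.mean())

def think_at_n(responses_layer_preds):
    """Think@n-style selection (sketch): among n parallel samples, pick the
    response with the highest DTR instead of taking a majority vote."""
    scores = [deep_thinking_ratio(p) for p in responses_layer_preds]
    return int(np.argmax(scores))

# Toy example: 4 layers, 3 output tokens per response.
stable   = [[1, 5, 9], [2, 5, 9], [3, 5, 9], [3, 5, 9]]  # settles early -> DTR 0
revising = [[1, 5, 9], [2, 5, 9], [3, 6, 9], [4, 5, 9]]  # two tokens revised late
```

Because a single argmax per response is selected rather than aggregating all n samples, this kind of selection can match consensus-style voting while scoring far fewer total tokens, which is where the roughly 2x compute saving in the video would come from.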