Can LLMs Truly Build a Complete Project Repository from Scratch? (Chinese Talk)

Can LLMs Truly Build a Complete Project Repository from Scratch? Findings from Long-Horizon Generation Evaluation

Recent progress in code generation has demonstrated strong performance on short-horizon tasks such as function synthesis and local code completion. However, whether large language models can sustain coherent planning, architectural consistency, and execution reliability across the full lifecycle of building a real project repository remains an open question.

This talk presents findings from NL2Repo-Bench, a long-horizon evaluation benchmark that challenges models to construct a complete, runnable Python repository from scratch using only a natural language specification and an empty workspace. Results show that even with a perfectly designed prompt, current models frequently fail under long-horizon settings, exhibiting logical collapse, fragile cross-file dependencies, and insufficient global planning. The study highlights long-horizon reasoning as a critical bottleneck for autonomous coding agents.

Paper: NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents
https://arxiv.org/pdf/2512.12730

Speaker: Shengda Long, Master's Student, Peking University
Host: Ruiwen Zhou, PhD Student, National University of Singapore
