Can LLMs Truly Build a Complete Project Repository from Scratch? (Chinese Talk)
Can LLMs Truly Build a Complete Project Repository from Scratch? Findings from Long-Horizon Generation Evaluation

Recent progress in code generation has demonstrated strong performance on short-horizon tasks such as function synthesis and local code completion. However, whether large language models can sustain coherent planning, architectural consistency, and execution reliability across the full lifecycle of building a real project repository remains an open question.

This talk presents findings from NL2Repo-Bench, a long-horizon evaluation benchmark that challenges models to construct a complete, runnable Python repository from scratch using only a natural language specification and an empty workspace. Results show that even with a perfectly designed prompt, current models frequently fail under long-horizon settings, exhibiting logical collapse, fragile cross-file dependencies, and insufficient global planning. The study highlights long-horizon reasoning as a critical bottleneck for autonomous coding agents.

Paper: NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents
https://arxiv.org/pdf/2512.12730

Speaker: Shengda Long, Master’s Student, Peking University

Host: Ruiwen Zhou, PhD Student, National University of Singapore