У нас вы можете посмотреть бесплатно Jan 2022: OpenAI Made AI Obedient. That Was the Mistake. | Prompt Injection History или скачать в максимальном доступном качестве, видео которое было загружено на ютуб. Для загрузки выберите вариант из формы ниже:
Если кнопки скачивания не
загрузились
НАЖМИТЕ ЗДЕСЬ или обновите страницу
Если возникают проблемы со скачиванием видео, пожалуйста напишите в поддержку по адресу внизу
страницы.
Спасибо за использование сервиса ClipSaver.ru
Join my newsletter The AI Archive - Exploring AI security, agents, and automation: https://the-ai-archive.com On January 27th, 2022, OpenAI fine-tuned GPT-3 to follow instructions using RLHF — and unknowingly created the attack surface behind every prompt injection that followed. This is Episode 1 of a series tracing the complete history of prompt injection: 132 incidents over 50 months, and still no production fix. 📄 Papers & Publications Referenced: OpenAI InstructGPT Publication (January 27, 2022) - Aligning language models to follow instructions: https://openai.com/index/instruction-... InstructGPT Paper (Ouyang et al., March 2022) - "Training language models to follow instructions with human feedback": https://arxiv.org/abs/2203.02155 OpenAI Release (June 13, 2017) - Learning from human preferences: https://openai.com/index/learning-fro... RLHF Origin Paper (Christiano et al., 2017) - "Deep Reinforcement Learning from Human Preferences": https://arxiv.org/abs/1706.03741 ⏱ Timestamps: 0:00 The Day AI Became Hackable 0:25 132 Incidents, Zero Fixes 1:04 RLHF: Teaching AI to Obey 2:15 InstructGPT Deployed as Default Model 2:40 The Hidden Vulnerability 3:41 Obedience Creates Exploitability 3:56 The Instruction-Following Paradox 4:46 Four Takeaways #AIArchive #theaiarchive #AIUnderAttack #PromptInjection #AISecurity #InstructGPT #RLHF #OpenAI #LLM #AIHistory #Cybersecurity