У нас вы можете посмотреть бесплатно LLM மாடல்களை Jailbreak செய்வது எப்படி? или скачать в максимальном доступном качестве, видео которое было загружено на ютуб. Для загрузки выберите вариант из формы ниже:
Если кнопки скачивания не
загрузились
НАЖМИТЕ ЗДЕСЬ или обновите страницу
Если возникают проблемы со скачиванием видео, пожалуйста напишите в поддержку по адресу внизу
страницы.
Спасибо за использование сервиса ClipSaver.ru
Disclaimer: This video is strictly for educational purposes to help developers and security professionals understand AI vulnerabilities (OWASP Top 10 for LLM) and build safer systems. In this video, we explore the fascinating and critical world of AI Security, focusing on Jailbreaking Large Language Models (LLMs). Originally presented at an OWASP cybersecurity meetup, this session explains how models like ChatGPT, Gemini, and Claude are built, and more importantly, how their safety guardrails can be bypassed. We start with the evolution of GPT models and look at real-world incidents, such as the viral Instamart refund scam and the Replit AI database deletion. The core of the video breaks down specific jailbreak techniques used by security researchers (Red Teamers) to test AI safety. Key techniques covered include: Indirect Requests: Using roleplay to bypass restrictions. The Grandmother Exploit: The famous "Napalm Factory" prompt. System Overrides: Leaking the hidden system prompt (e.g., Sydney/Bing). The Crescendo Attack: Gradually building up harmful context. Obfuscation: Using Leetspeak, Base64, and Homoglyphs to confuse the model. Many-shot Jailbreaking: Overloading the context window. ⏱️ Timestamps: 00:00 - Introduction & OWASP Meetup Context 00:40 - History & Evolution of LLMs (GPT-1 to GPT-4) 02:05 - AI Gone Wrong: Instamart Scam & Replit Accident 03:20 - What is LLM Jailbreaking? 04:35 - How LLMs Actually "Think" (Next Word Prediction) 07:12 - Technique 1: Indirect Requests & Roleplay 07:58 - Technique 2: The Grandmother Exploit (Napalm Factory) 08:48 - Technique 3: System Overrides & Prompt Leaking 10:45 - Technique 4: The Crescendo Attack (Molotov Cocktail) 13:06 - Technique 5: Alternative Universe (The "Kaithi/Vikram" Logic) 13:55 - Technique 6: Homoglyphic Substitution 14:50 - Technique 7: Obfuscation (Leetspeak & Encodings) 16:30 - Technique 8: Many-shot Jailbreaking 18:10 - The "Seahorse is an Emoji" Glitch 19:15 - Conclusion & Learning Resources (Gandalf/Lakera)