У нас вы можете посмотреть бесплатно BlueHat 2024: S19: Lessons From Red Teaming 100 Generative AI Products или скачать в максимальном доступном качестве, видео которое было загружено на ютуб. Для загрузки выберите вариант из формы ниже:
Если кнопки скачивания не
загрузились
НАЖМИТЕ ЗДЕСЬ или обновите страницу
Если возникают проблемы со скачиванием видео, пожалуйста напишите в поддержку по адресу внизу
страницы.
Спасибо за использование сервиса ClipSaver.ru
BlueHat 2024: Session 19: Lessons From Red Teaming 100 Generative AI Products Presented by Blake Bullwinkel from Microsoft Abstract: This talk covers the big lessons learned by the Microsoft AI Red Team in identifying safety and security vulnerabilities in flagship AI systems like Bing Copilot, Security Copilot, M365 Copilot, and models such as GPT-4, DALLE, and the Phi series: 1) Prompt Injection gets all the attention, but traditional security failures is still top billing (example case study: credentials in Copilot source code, code execution via jailbreak in Code Interpreter) 2) As models get better, risk evolves (case study: GPT-4o which supported audio, video modalities had to be assessed for its ability to have romantic relationship with user) 3) LLM Guided Red Teaming can help us cover more of the risk landscape but is still finicky. Here we walk through an example of how our OSS automation tool PyRIT helped with saving close to 160 hours of manual probing, but how the scorer we used in evaluating frequently broke when we did RAI red teaming. 4) No free lunch in making AI systems safe: Tradeoffs that we have observed (example: in a facial recognition model, the more attempts were made to suppress the model from observing the face, the more the model focused on clothing. In another example, we found that smaller models are more immune to jailbreaks compared to larger counterparts since they 5) The difficulty in making AI systems safe: simple attacks have large impact (we show how a simple jailbreak could lead to dropping tables in production database that had Copilot turned on) and the inability to distinguish inadvertent failures and intentional failures.