I rischi catastrofici dell'AI: un focus sul disallineamento (The catastrophic risks of AI: a focus on misalignment)
What happens when an AI behaves in a misaligned way? What are the most common phenomena?

Chapters:
00:00:00 | HAL 9000
00:02:05 | The problem and why it matters
00:10:23 | What alignment is
00:13:33 | A robotics example
00:18:59 | Training an LLM
00:29:15 | Techniques for aligning LLMs
00:42:53 | Misalignment phenomena
00:47:06 | Ex. 1: ChatGPT and CAPTCHA
00:48:57 | Ex. 2: Illegal loans
00:51:18 | Ex. 3: Insider trading
00:55:31 | Ex. 4: Scheming
01:03:38 | Ex. 5: Emergent misalignment
01:06:03 | Ex. 6: Alignment faking
01:13:32 | Ex. 7: Sycophancy
01:22:40 | What are the companies doing?
01:26:23 | Are there definitive solutions?
01:28:42 | Closing thoughts

SOURCES

The seven examples:
1. ARC, ChatGPT and CAPTCHA: https://metr.org/blog/2023-03-18-upda...
2. Chat Bankman-Fried: https://aclanthology.org/2025.finnlp-...
3. LLMs Can Strategically Deceive: https://arxiv.org/pdf/2311.07590
4. Frontier Models are Capable of In-context Scheming: https://arxiv.org/abs/2412.04984
5. Emergent Misalignment: https://arxiv.org/abs/2502.17424
6. Alignment Faking: https://www.anthropic.com/research/al...
7. ChatGPT actually fixed: https://stevenadler.substack.com/p/is...

Alignment:
Gradient descent: https://freedium.cfd/https://medium.c...
Alignment explained (beginner-friendly): https://bluedot.org/blog/what-is-ai-a...
Surveys: https://arxiv.org/pdf/2309.15025, https://arxiv.org/pdf/2310.19852, https://arxiv.org/abs/2407.16216
Specification gaming: https://deepmind.google/discover/blog...
Risks from Learned Optimization: https://arxiv.org/pdf/1906.01820
Deceptively Aligned Mesa-Optimizer: https://www.astralcodexten.com/p/dece...

Alignment techniques & their problems:
Recap from Hugging Face: https://huggingface.co/blog/rlhf
RLHF: https://huyenchip.com/2023/05/02/rlhf...
Nothing Comes Without Its World: https://ojs.aaai.org/index.php/AIES/a...
Rejected Dialects: https://arxiv.org/html/2502.12858v1
Cheap, outsourced labour: https://www.theguardian.com/technolog...
More:
Reasoning Models Don't Always Say What They Think: https://www.anthropic.com/research/re...
Bengio's TED Talk: • The Catastrophic Risks of AI — and a Safer...
Bengio's Time article: https://time.com/7290554/yoshua-bengi...
OpenAI Preparedness Framework: https://cdn.openai.com/pdf/18a02b5d-6...
Anthropic RSP: https://www.anthropic.com/news/anthro...
Claude's Constitution: https://www.anthropic.com/news/claude...
OpenAI and mass manipulation: https://fortune.com/2025/04/16/openai...
Safety washing: https://www.lesswrong.com/posts/PY3HE...
Is Inner Alignment solved? https://www.lesswrong.com/posts/xAsvi...

---------------------------------------------------------

Subscribe to the channel to stay up to date on everything happening in the AI world (without the hype).

My playlists:
• AI Focus
• Technicismi
• AI per iniziare
• Spiegozzi & Riflessioni
• Discorsi con Enkk
• Proviamo l'AI

Twitch - / enkk
Instagram - / enkkgram
TikTok - / enkkclips
All my links - https://linktr.ee/enkk

Edit by @Coste9 | https://linktr.ee/coste9

#IntelligenzaArtificiale #Technicismi #TechItalia #OpenAI #Gemini #GPT #FuturoIA #AIGenerativa