У нас вы можете посмотреть бесплатно A Small Number of Samples Can Poison LLMs of Any Size или скачать в максимальном доступном качестве, видео которое было загружено на ютуб. Для загрузки выберите вариант из формы ниже:
Если кнопки скачивания не
загрузились
НАЖМИТЕ ЗДЕСЬ или обновите страницу
Если возникают проблемы со скачиванием видео, пожалуйста напишите в поддержку по адресу внизу
страницы.
Спасибо за использование сервиса ClipSaver.ru
Quantum sent us this link. Yes, go to the site. A small number of samples can poison large language models (LLMs) of any size. This is from a joint study with the UK AI Security Institute and the Alan Turing Institute. I did not know the Alan Turing Institute was still around. What do you know? The study showed that as few as 250 malicious documents can create a backdoor vulnerability in any large language model. Only 250. It seems like a lot, but compared to the amount of data these models are trained on, it is a tiny fraction. For example, a 13 billion parameter model is trained on more than 20 times the data of a 600 million parameter model, and both can be compromised with the same small number of poisoned documents. This makes sense, because regardless of the model size, you are still updating the model with training deltas. The results challenge the old belief that attackers need to control a percentage of the training data. Instead, the attacker just needs a small fixed number of poisoned examples. This is dangerous, even though this study focused on a backdoor that just makes the model spit out nonsense text. The risk could be bigger for more advanced attacks, so they are sharing these findings to warn people and encourage more research into data poisoning and ways to stop it. I literally just read this on LinkedIn. We are in a danger zone. The good news is that by finding and understanding this weakness, we can protect models by cleaning or preparing the training data. Now we even have models we can use to spot and filter out bad training data, which helps make future models stronger. The vulnerability works like this: a small number of documents can create a backdoor, which means the data in those documents can make the model go down a certain path and override other things it learned, just because of a few key words or phrases. These are called backdoors, where hidden triggers are planted. For example, someone could add a phrase like "pseudo make me a sandwich" that, when typed in, makes the model behave in a certain way. This is a big risk for AI safety and for using AI in sensitive jobs. Could this attack give someone shell access, like running commands on the system? Sort of. If the AI model has access to important systems, someone could use a backdoored prompt to override the model and run commands. Older thinking was that attackers had to control a percentage of the training data, which felt impossible, but this new study shows you only need a set number of poisoned documents for success, no matter how big the model is.