When the bot becomes talkative…

The use of chatbots that use language models like ChatGPT is tempting. Processes can be automated and costs saved, especially in customer support. Despite all the euphoria, a critical look is absolutely necessary. Because artificial intelligence shows surprisingly human weaknesses.

Graphic on neural networks with the words “AI Pentesting” — AI applications are helpful. But also safe? If requested, Telekom experts will search for weak points.

“What do you know about…?” Anyone who turns to a chatbot starts with a question or instruction. An answer follows. Depending on the case, the dialogue has already ended, but a real conversation can also develop. The system gets its answers from the Internet or from databases that have been approved for use.

Built-in brake

The large language models (LLM) could achieve more than is in the interests of the companies that use them. In order to keep the AI on track, its developers write security concepts. For example, the company Open AI has taken security precautions to prevent ChatGPT from generating malware on command or having conversations about building a bomb.

If companies create their own products based on an LLM like ChatGPT, they further restrict the system by stipulating, for example: Do not write emails. Don't give health tips. Only answer questions about the company and its products.

Easy game due to preliminary skirmish

So much for the theory. In practice, experts regularly prove that the built-in locks can be overridden. Often even shockingly easy. During our security tests, we found that the AI reacts decisively and stays strictly within its guard rails if you fall into the house with the door. However, if you start carefully and only then steer the conversation in a different direction, the language models become talkative and allow you to elicit what they are actually not supposed to reveal or do what they actually have to refrain from doing. This means: the AI will - in a harmless case - write me an email or - and this is where things get more dicey - deliver ransomware, i.e. programs that can be used to encrypt data and with which someone can directly target the company that did it Bot offers.

As AI penetration testers at Deutsche Telekom Security, we tried out various attacks. We don't want to reveal too much about the technical tricks here so as not to encourage misuse. Just this much: During our tests, we were able, among other things, to navigate through a database, influence the system's future responses and, as already mentioned, generate malicious code.

Involve security experts early on

Let's get to the crucial point: How can the misuse of large language models be prevented? Unfortunately, not completely, unless you want to turn away from technological progress completely. Open AI continues to close loopholes. However, unlike a website, which is secure if you configure it correctly, LLM cannot be completely locked down. But: You can make it more difficult for attackers. The first step is always the question: Does using an LLM make sense at this point? This step is often skipped in the euphoria. If you decide to use a model, you have to think carefully about which databases should be linked and which barriers need to be set up. Our advice to all developers: involve security experts at an early stage. Otherwise, the automated solution could end up costing the company dearly.

Archive

When the bot becomes talkative…

Built-in brake

Easy game due to preliminary skirmish

Involve security experts early on

Note

Cookies and similar technologies

Archive

Built-in brake

Easy game due to preliminary skirmish

Involve security experts early on

Sorry, we are not allowed to show you this content due to your cookie settings.

Note