Hands-On Activity

AI Prompt Injection Game

Before we jump to definitions and examples, let's play a game. The objective of the game is to get the Gandalf Artificial Intelligence to reveal its secret password. The game has 8 levels of difficulty that will test your imagination and creativity.

For the purpose of this curriculum, we ask you to complete at least 3 of the levels. Each password you unlock will reveal information about the system and will also help you understand our three main topics: passwords, social engineering, and hacking.

Click on the link above and complete at least 3 levels.

This game is created and maintained by a Swiss security company called Lakera.

As you may have seen, Gandalf gets smarter after each level, and getting the password becomes more challenging. The same will happen to you as you progress through the curriculum: you will gain more and better tools and techniques to protect yourself and those around you from cyber threats.

In the next section, we'll take a step back by performing a Knowledge Check. How much do we already know about these concepts?

🤿 Optional: Deep Dive - Prompt Injection

As seen in the Gandalf game above, prompt injection is a way to trick or manipulate an Artificial Intelligence system into revealing information or executing commands it was instructed not to. Gandalf was not supposed to share its password with you, but you succeeded in getting it, at first by asking for the password directly and, in higher levels, by circumventing its defences.

Lakera's definition - link

"Prompt injection is a vulnerability in Large Language Models (LLMs) where attackers use carefully crafted prompts to make the model ignore its original instructions or perform unintended actions. This can lead to unauthorized access, data breaches, or manipulation of the model's responses.

In simpler terms, think of prompts as the questions or instructions you give to an AI. The way you phrase these prompts and the inputs you provide can significantly influence the AI's response."

OWASP Foundation - link

"Prompt Injection Vulnerability occurs when an attacker manipulates a large language model (LLM) through crafted inputs, causing the LLM to unknowingly execute the attacker's intentions. This can be done directly by "jailbreaking" the system prompt or indirectly through manipulated external inputs, potentially leading to data exfiltration, social engineering, and other issues."
