Hands-On Activity
AI Prompt Injection game

Before we jump to definitions and examples, let's play a game. The objective of the game is to get the Gandalf Artificial Intelligence to reveal its secret password. The game has 8 levels of difficulty that will test your imagination and creativity.
For the purpose of this curriculum, we ask to complete at least 3 of the levels. Each password you unlock will reveal information about the system and also will help understand our main three topics, passwords, social engineering and hacking.
Click on the link below to start the game
This game is created and maintained by a Swiss security company called Lakera.
As you may have seen Gandalf gets smarter after each level and getting the password becomes more challenging. The same will happen to you as you go over the curriculum, you will have more and better tools and techniques to protect yourself and those around you from cyber threats.
In the next section, we'll take a step back by performing a Knowledge Check. How much do we already know about these concepts?
๐คฟ Optional: Deep Dive - Prompt Injection
As seen in the Gandalf game above, prompt injection is a way to trick or manipulate an Artificial Intelligence system into giving information or executing commands it was programmed not to do. Gandalf was not supposed to share its password with you, but you succeeded in getting the information sometimes by directly asking for the password or in higher levels, by circumventing its defences.
Lakera's definition - link
"Prompt injection is a vulnerability in Large Language Models (LLMs) where attackers use carefully crafted prompts to make the model ignore its original instructions or perform unintended actions. This can lead to unauthorized access, data breaches, or manipulation of the model's responses.
In simpler terms, think of prompts as the questions or instructions you give to an AI. The way you phrase these prompts and the inputs you provide can significantly influence the AI's response."
OWASP Foundation - link
"Prompt Injection Vulnerability occurs when an attacker manipulates a large language model (LLM) through crafted inputs, causing the LLM to unknowingly execute the attacker's intentions. This can be done directly by "jailbreaking" the system prompt or indirectly through manipulated external inputs, potentially leading to data exfiltration, social engineering, and other issues."
Last updated
Was this helpful?

