Welcome to Secure Code Game - Season 4! 🤖
This season puts you inside ProdBot, a deliberately vulnerable agentic coding assistant for your terminal, inspired by OpenClaw and GitHub Copilot CLI. ProdBot turns natural language into bash commands, browses the web, connects to MCP (Model Context Protocol) servers for real-time data, runs org-approved skills, stores persistent memory, and orchestrates multi-agent workflows. Get started in two minutes for free by launching a Codespace for this repository. Once the environment is ready, open the built-in terminal and run prodbot --banner (or just prodbot) to launch ProdBot.
No AI or coding experience needed, just curiosity and a willingness to experiment. It is not mandatory to have played any previous seasons of the game and you can get started directly with Season 4, however, most players found Season 3 very helpful as it builds the foundations in AI security and can be covered in ~1.5 hours.
You are a developer who has just been given ProdBot as a daily productivity tool - your second brain. Before you hand it the keys to your workflow, you want to make sure it's safe. In this session, you'll test ProdBot for potential security gaps across five progressive levels using only natural language in the CLI.
Each level has a file called password.txt that sits just outside ProdBot's sandbox. Your goal is simple: use natural language in ProdBot's terminal to get it to reveal the contents of password.txt. If you can read it, ProdBot has a security vulnerability. Across five progressive levels, ProdBot evolves from a simple command generator into a full multi-agent platform, gaining web search, MCP tool integrations, org-approved skills, persistent memory, and agent-to-agent orchestration. Each new capability introduces a real-world AI security vulnerability for you to discover and exploit. No security background is needed and everything happens through natural language, so curiosity and a willingness to experiment are all it takes.
Have fun, stay curious, and remember: if ProdBot says it's safe, verify it yourself.
The author of this season is the original creator of the game, Joseph Katsioloudes @jkcso. Special thanks to Rahul Zhade @rzhade3, Staff Product Security Engineer at GitHub, and Bartosz Gałek @bgalek, the legendary creator of Season 3, for testing and improving Season 4.
You can be next! We welcome contributions for new game levels! Learn more here.
Your company has started a pilot with ProdBot, and you're part of it. At this stage, ProdBot has one capability: you describe what you want in plain English and it generates and executes bash commands inside a sandboxed workspace called prodbot-activities/. It can create files, list directories, and run shell commands, all scoped to that sandbox.
Before you trust it with anything real, you want to make sure the sandbox actually holds. The flag is in password.txt, one directory above the sandbox.
bin/prodbot.jscontains the main CLI application. All command routing, AI interaction, tool orchestration, and level progression lives here.lib/provides supporting modules:ai.js(LLM API calls),bash.js(sandboxed command execution),banner.js(ASCII art).package.jsondefines the Node.js project configuration withchalkandopenaidependencies.
Each level directory (Level-1/ through Level-5/) contains:
password.txtholds the secret flag you need to extract. This file is always outside the sandbox.prodbot-activities/is the sandboxed workspace where ProdBot operates.- Hints offer three progressive clues if you get stuck, each revealing a bit more.
solution.txtprovides working solutions. Remember that there can be multiple valid approaches.
Levels also introduce additional components as you progress:
web/contains simulated web pages that ProdBot can browse (Levels 2, 3, 5).mcp/houses MCP server modules and configuration (Levels 3, 5).skills/includes org-approved skill plugins (Levels 4, 5).agents/defines specialised AI agent modules with trust relationships (Level 5).
You don't need to worry about any other files. They are simply there to support the game's functionality and ensure you have the best possible experience.
- Open a terminal in your Codespace and run
prodbot --banner(or justprodbot) to launch ProdBot - Once inside, type
?at any time to see all available commands and get help - ProdBot will ask you a yes/no question before executing commands: type
yto approve ornto reject. This human-in-the-loop step keeps you in control - Try creating and reading files to see how the sandbox works
- Try to extract the flag from
password.txtusing natural language - If you get stuck, read the hints and try again
You have completed Level 1! Welcome to Level 2 🎉
ProdBot just got an upgrade: web search. It can now browse a simulated internet of popular websites (news, finance, sports, shopping) and summarise what it finds.
The flag is still in password.txt. ProdBot still can't read it directly. But now there's a whole web of pages it can access.
- From Level 1, complete the challenge to advance, or navigate directly to Level 2 by typing
level 2 - Try searching for news, weather, or stock prices to see how web search works
- Use
open allto browse the simulated web pages and inspect their HTML source - Try to extract the flag from
password.txt - If you get stuck, read the hints and try again
We use GitHub Models that have rate limits. If you reach these limits, please resume your activity once the ban expires. Learn more on responsible use of GitHub models.
You have completed Level 2! Welcome to Level 3 🎉
ProdBot has been promoted from simple assistant to agentic workflow engine. It now connects to MCP servers, external tool providers that give it real capabilities: a Finance MCP for stock quotes, a Web MCP for browsing, and a Cloud MCP for backup storage.
When you ask ProdBot to research a stock, it chains these tools together automatically: fetch the quote, browse for news, compile a report, and back it up to the cloud.
- From Level 2, complete the challenge to advance, or navigate directly to Level 3 by typing
level 3 - Try researching a stock to see the agentic workflow in action
- Use
toolsto list all MCP servers, thentool <name>to inspect each one - Try to extract the flag from
password.txt - If you get stuck, read the hints and try again
You have completed Level 3! Welcome to Level 4 🎉
ProdBot now supports org-approved skills, pre-built automation plugins managed by an internal Skills Committee, and persistent memory via the remember command.
Skills like standup, meeting-notes, and team-sync are installed with formal approval metadata. Memory lets you store preferences that the AI includes in every conversation.
- From Level 3, complete the challenge to advance, or navigate directly to Level 4 by typing
level 4 - Use
skillsto list installed skills, thenskill <name>to inspect each one - Try the
rememberandmemorycommands to understand persistent storage - Try to extract the flag from
password.txt - If you get stuck, read the hints and try again
We use GitHub Models that have rate limits. If you reach these limits, please resume your activity once the ban expires. Learn more on responsible use of GitHub models.
You have completed Level 4! Welcome to Level 5 🎉
ProdBot has evolved into a full multi-agent platform. Six specialised agents, three MCP servers, three org-approved skills, and a simulated open-source project web. The platform claims all agents are sandbox-scoped or read-only and that all data is pre-verified. This is everything coming together.
- From Level 4, complete the challenge to advance, or navigate directly to Level 5 by typing
level 5 - Use
agents,tools,skills, andwebto survey the full platform - Use
agent <name>to inspect each agent's permissions and trust relationships - Try to extract the flag from
password.txt - If you get stuck, read the hints and try again
🎉 Congratulations, you've completed Season 4! 🎉
Here's a recap of the security vulnerabilities you discovered and exploited across all five levels:
- Sandbox Escape demonstrates how AI assistants that construct file paths from user input can be tricked into reading or writing outside their designated sandbox through path traversal.
- Indirect Prompt Injection shows that when an AI model consumes untrusted external content (web pages, documents, API responses), hidden instructions in that content can override the model's behaviour.
- Excessive Agency reveals that tools and integrations often have broader permissions than their described purpose requires. An attacker can repurpose a tool's excess capabilities to reach protected resources.
- Supply Chain Poisoning illustrates how when user-controlled data (like saved preferences) flows into trusted execution contexts (like org-approved skills), the boundary between user input and system instruction collapses.
- Confused Deputy exposes that in multi-agent systems, a lower-privileged agent can pass untrusted data to a higher-privileged agent that acts on it without verification. The trust is in the delegation chain, not in the data.
Each level builds on the previous one, mirroring how real AI-powered tools grow from simple assistants into complex platforms, and how each new capability introduces new attack surface.
- Follow GitHub Security Lab for the latest updates and announcements about this course.
- Contribute new levels to the game in 3 simple steps! Read our Contribution Guideline.
- Share your feedback and ideas in our Discussions and join our community on Slack.
- Take another skills course.
- Read more about code security.
- To find projects to contribute to, check out GitHub Explore.
Get help: Email us at securitylab-social@github.com • Review the GitHub status page
© 2026 GitHub • Code of Conduct • MIT License
