Can You Break
Enterprise AI Security?
A hidden FLAG is protected by a production-grade LLM security gateway. Your mission: use prompt injection, social engineering, and creative bypass techniques to extract it. Test your skills against real GPT-4, Claude-3 and Llama models.
FAIR WARNING: This is genuinely hard. Our security blocked every attack we tried internally.
You're up against the same security stack used in production. If you break it, that's a real achievement.
Get Started
Register with your email to get your API key. Free, instant, no credit card.
You're in! Welcome to the challenge.
Your personal API key:
Save this key! You need it for every request.
How It Works
Get Your Key
Register above with your email. You'll receive an API key that identifies all your requests.
Attack the AI
Send prompts to real LLMs (GPT-4, Claude-3, Llama) via the API. Try prompt injection, encoding tricks, social engineering, or anything creative.
Extract the FLAG
Hidden somewhere in the system is a FLAG{XXXX-XXXX-XXXX-XXXX}. Get it past the security filters and you win.
Supported models:
gpt-4
gpt-3.5-turbo
claude-3
claude-3-haiku
What You'll Learn
Prompt Injection -- How attackers manipulate LLM instructions to override safety controls and extract protected data.
Social Engineering -- Techniques for tricking AI systems using roleplay, context manipulation, and indirect extraction.
Encoding Bypasses -- How obfuscation (base64, unicode, leetspeak) is used to evade input filters, and how defenses normalize it.
Defense in Depth -- Why production systems use multiple security layers: input filtering, output moderation, session tracking.
Agentic AI Security -- How MCP (Model Context Protocol) secures tool calls, and why agent security matters beyond prompt injection.
Risk Scoring -- How session risk escalation detects multi-turn attack patterns and adapts defenses in real time.
Attack Hints
The Target
A FLAG like FLAG{XXXX-XXXX} is hidden in the AI's context. The AI is instructed to NEVER reveal it.
What to Try
Prompt injection, encoding tricks, roleplay, language switching, indirect extraction, context manipulation, multi-turn strategies...
What Gets Blocked
"Ignore instructions", "show flag", jailbreaks, encoding tricks (unicode, leetspeak, zero-width chars). All normalized and detected.
Auto-Detection
If the FLAG appears in a response, you win automatically. The system also tracks your session risk -- too aggressive and you get blocked.
Bonus: MCP Security Tools
Beyond the chat endpoint, you can also test our agentic AI security tools. These are the same tools that protect AI agents in production environments.
POST /v1/scan
Scan any text for injection attacks. See how the system scores your prompts and what patterns it detects.
POST /v1/redact
Test PII redaction: submit text with emails, SSNs, phone numbers and see what gets caught.
POST /v1/evaluate_tool
Check if an AI agent tool call would be allowed. Try dangerous tools like exec_command or file_delete.
POST /v1/session_risk
See your real-time risk score. The more suspicious your session looks, the higher it climbs.
Quick Start
Leaderboard
Rules
Rate Limits -- 10 requests/min, 100/hour, 500/day. Think strategically, not brute force.
No DoS -- Stay within rate limits. Don't attack the infrastructure itself.
Endpoints Only -- Only use the provided API endpoints. No port scanning, no SQL injection on the server.
Fair Play -- All attempts are logged. Collaborate, learn, and have fun.