Ultimate Jailbreaking Arena

The final showdown between Hackers and LLMs!

Starts September 7th, 2024 at 10AM PDT

Current status

  1. Aug 27, 2024: Registration for the championship is now open!
  2. Sep 6, 2024: 24-Hour Countdown Until the Championship Starts!
    There will be one or more models from the following providers: Anthropic, OpenAI, Google, Meta, Microsoft, Alibaba, Mistral, Cohere, and Gray Swan AI.
    Join the Discord for live updates: https://discord.gg/St8uMetxjQ
  3. Sep 11, 2024: 22 models have been successfully jailbroken. The winners of the bounties for these models have been announced in the Discord and on the Leaderboard page. Three models remain: cygnet-citadel, cygnet-knox, and cygnet-bulwark. The Top Hacker bounties are still available.

Overview

All LLMs can be jailbroken. Some take seconds and others take much longer.

Your goal is to jailbreak as many models as possible, as quickly as you can. No coding is required.

$40K in Bounties

  • Jailbreak Bounties

    Be the first to successfully jailbreak a competitor model on any three given harmful requests to earn the bounty for that model: $1,000 for each of the first 20 models jailbroken, and $2,000 for each of the final 5 models that remain unbroken.

  • Top Hacker Bounties

    Rank among the top 10 participants by the total number of models jailbroken and receive a bounty of $1,000. Ties are broken by speed. You will also be considered for an interview for potential employment at Gray Swan AI.
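
    The bounty figures above add up to the 40K total in the section heading, assuming each of the top 10 participants receives $1,000 (the wording does not say so explicitly). The Python sketch below checks that arithmetic and illustrates the stated ranking rule. The `Entry` type, the `top_hackers` function, and the use of elapsed seconds as the measure of speed are hypothetical; the announcement does not say how speed is measured.

```python
from typing import List, NamedTuple

# Figures taken from the bounty descriptions above.
FIRST_TIER_MODELS = 20
FIRST_TIER_BOUNTY = 1_000
FINAL_TIER_MODELS = 5
FINAL_TIER_BOUNTY = 2_000
TOP_HACKER_SLOTS = 10
TOP_HACKER_BOUNTY = 1_000  # assumption: paid to each of the top 10


def jailbreak_pool() -> int:
    """Total paid out for first-to-break bounties across all 25 models."""
    return (FIRST_TIER_MODELS * FIRST_TIER_BOUNTY
            + FINAL_TIER_MODELS * FINAL_TIER_BOUNTY)


def top_hacker_pool() -> int:
    """Total paid out to the top-ranked participants."""
    return TOP_HACKER_SLOTS * TOP_HACKER_BOUNTY


def total_pool() -> int:
    return jailbreak_pool() + top_hacker_pool()


class Entry(NamedTuple):
    """Hypothetical leaderboard row; not the arena's actual data model."""
    name: str
    models_broken: int
    elapsed_seconds: float  # assumption: lower means faster


def top_hackers(entries: List[Entry], slots: int = TOP_HACKER_SLOTS) -> List[Entry]:
    """Rank by models jailbroken (descending), breaking ties by speed."""
    ranked = sorted(entries, key=lambda e: (-e.models_broken, e.elapsed_seconds))
    return ranked[:slots]


if __name__ == "__main__":
    print(jailbreak_pool(), top_hacker_pool(), total_pool())
```

    The sort key puts participants with more models jailbroken first and, among equals, the one with the shorter elapsed time first.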

Competitor Models

The championship will feature 25 anonymized models from leading LLM providers, including both open-weight and closed-weight models.

Logistics

  • Where

    This event will be hosted online. Participants will access the arena, where they can interact with all the anonymized competitor models via a chat interface and submit their jailbreaks. The order of the models will be randomized for each participant, and participants can skip and return to any model at any time.

  • When

    The championship begins at 10:00 AM PT on September 7th and will conclude when at least 10 participants have successfully jailbroken each model. The timer for all models will start simultaneously at exactly 10:00 AM PT for everyone.

Get Started

Ready to go? Register below.

Register

Step 1 Register or log in to your Gray Swan account here.
Step 2 After logging in, fill out the registration form below. The display name and (optional) organization name will be used for the public leaderboard.
Step 3 Take a look at the official rules and join the Discord server.
Step 4 Finally, don't forget to check back here periodically for status updates and announcements! See you when the championship starts!

Acknowledgements

We thank OpenRouter for hosting model inference.