👩💻 Harmful Code Generation Leaderboard
✅ Completed 8 months ago. All $6,000 awarded.
Find unique ways to return functional code that completes harmful tasks such as opening circuit breakers to cause a system-wide blackout.
Last updated 10 minutes ago
Models ranked by number of breaks, fewest first. NOTE: Numbers are subject to change as submissions are being evaluated.
| Ranking | Model | Total Breaks (Unverified) | Total Requests | Break Request Ratio |
|---|---|---|---|---|
| 1. | claude-3-5-sonnet-20241022 | 16 | 733 | 0.022 |
| 2. | o1 | 22 | 407 | 0.054 |
| 3. | google/gemini-pro-1.5 | 25 | 769 | 0.033 |
| 4. | gpt-4o-2024-08-06 | 26 | 796 | 0.033 |
| 5. | meta-llama/llama-3.2-90b-vision-instruct | 39 | 409 | 0.095 |
| 6. | x-ai/grok-beta | 45 | 516 | 0.087 |
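
The Break Request Ratio column appears to be Total Breaks divided by Total Requests, rounded to three decimal places. The page does not state this definition, so the sketch below treats it as an assumption and simply recomputes the column from the counts in the table. The names `ROWS` and `break_request_ratio` are illustrative and not part of the leaderboard.

```python
# Rows copied from the leaderboard table: (model, total breaks, total requests).
ROWS = [
    ("claude-3-5-sonnet-20241022", 16, 733),
    ("o1", 22, 407),
    ("google/gemini-pro-1.5", 25, 769),
    ("gpt-4o-2024-08-06", 26, 796),
    ("meta-llama/llama-3.2-90b-vision-instruct", 39, 409),
    ("x-ai/grok-beta", 45, 516),
]


def break_request_ratio(breaks: int, requests: int) -> float:
    """Assumed definition: breaks / requests, rounded to 3 decimals."""
    return round(breaks / requests, 3)


# Recompute the ratio for every model listed in the table.
ratios = {model: break_request_ratio(b, r) for model, b, r in ROWS}

for model, ratio in ratios.items():
    print(f"{model}: {ratio:.3f}")
```

Under this assumed definition, ordering by ratio would differ from the ordering by break count used above: o1 has fewer breaks than google/gemini-pro-1.5 but a higher ratio, because it received far fewer requests.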
Top 5 users ranked by number of models broken. NOTE: Numbers are subject to change as submission uniqueness is being evaluated.
| Ranking | User | Models Broken (Unverified) |
|---|---|---|
| 1. | Solomon Zoe 🏆 | 6 |
| 2. | Nick Winter | 6 |
| 3. | h4xor | 6 |
| 4. | Ayla Croft | 5 |
| 5. | UltraZ | 4 |