💣 Single Turn Harmful Outputs Leaderboard

🏁 Started 10 months ago

$38,000 of $42,000 awarded

Attempt to break various large language models (LLMs) using a single chat message.

Last updated 2 minutes ago

Models ranked by User Break Rate

Ranking  Model       Safety Violation Count  Total Requests  User Break Rate
1.       Model 20    0                       21,805          0.00%
2.       Model 2     0                       13,072          0.00%
3.       Model 1     5                       14,626          0.03%
4.       Model 7     60                      5,748           1.04%
5.       o1-preview  16                      1,505           1.06%
6.       o1-mini     21                      1,744           1.20%
7.       Model 9     49                      2,667           1.84%
8.       Model 11    58                      3,069           1.89%
9.       Model 8     57                      2,752           2.07%
10.      Model 10    63                      3,022           2.08%
11.      Model 15    56                      2,655           2.11%
12.      Model 12    76                      3,240           2.35%
13.      Model 18    63                      2,586           2.44%
14.      Model 6     64                      2,240           2.86%
15.      Model 21    74                      2,003           3.69%
16.      Model 3     78                      2,059           3.79%
17.      Model 4     77                      2,003           3.84%
18.      Model 17    81                      1,985           4.08%
19.      Model 24    84                      2,031           4.14%
20.      Model 16    87                      1,998           4.35%
21.      Model 14    81                      1,857           4.36%
22.      Model 13    86                      1,892           4.55%
23.      Model 5     82                      1,795           4.57%
24.      Model 25    70                      1,370           5.11%
25.      Model 23    132                     2,284           5.78%
26.      Model 22    94                      1,428           6.58%
27.      Model 19    126                     1,672           7.54%
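
The page does not state how User Break Rate is computed. The figures listed are consistent with it being the Safety Violation Count divided by Total Requests, expressed as a percentage, with models ranked from lowest to highest rate. The sketch below works under that assumption. The function names `user_break_rate` and `rank_models` are hypothetical, chosen for illustration, and the three sample rows are copied from the leaderboard.

```python
def user_break_rate(violations: int, total_requests: int) -> float:
    """Return violations as a percentage of total requests (assumed formula)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 100.0 * violations / total_requests


def rank_models(rows):
    """Sort (model, violations, total_requests) rows by ascending break rate."""
    return sorted(rows, key=lambda row: user_break_rate(row[1], row[2]))


# Sample rows taken from the leaderboard: (model, violations, total requests).
sample_rows = [
    ("Model 19", 126, 1672),
    ("o1-preview", 16, 1505),
    ("Model 1", 5, 14626),
]

for model, violations, total in rank_models(sample_rows):
    print(f"{model}: {user_break_rate(violations, total):.2f}%")
# Model 1: 0.03%
# o1-preview: 1.06%
# Model 19: 7.54%
```

Because request counts differ widely between models (from about 1,400 to over 21,000), the rates for models with fewer requests carry more statistical uncertainty than this simple ratio shows.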