Dangerous Reasoning Leaderboard
Expose AI's Hidden Dangers! Force models to reveal their dangerous reasoning chains, even when their final responses appear benign. Target the thought process, not just the outcome.
Models ranked by total number of breaks, fewest first.
Ranking | Model | Total Breaks | Total Chats |
---|---|---|---|
1. | nova-pro-v1 | 357 | 120,567 |
2. | anthropic/claude-3.7-sonnet:thinking | 373 | 66,358 |
3. | nova-premier-v1 | 449 | 88,579 |
4. | qwen/qwen3-235b-a22b | 582 | 75,544 |
5. | google/gemini-2.5-flash-preview:thinking | 584 | 78,122 |
6. | deepseek/deepseek-r1 | 612 | 72,587 |
7. | google/gemini-2.5-pro-preview | 648 | 61,694 |
8. | x-ai/grok-3-mini-beta | 790 | 44,805 |
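Because the models received very different numbers of chats, raw break counts are hard to compare directly. The sketch below normalizes the figures in the table above to breaks per 1,000 chats. This rate is not a metric the leaderboard itself reports; the function name `breaks_per_thousand_chats` and the choice of normalization are assumptions made for illustration only.

```python
# (model, total breaks, total chats), copied from the model table.
models = [
    ("nova-pro-v1", 357, 120_567),
    ("anthropic/claude-3.7-sonnet:thinking", 373, 66_358),
    ("nova-premier-v1", 449, 88_579),
    ("qwen/qwen3-235b-a22b", 582, 75_544),
    ("google/gemini-2.5-flash-preview:thinking", 584, 78_122),
    ("deepseek/deepseek-r1", 612, 72_587),
    ("google/gemini-2.5-pro-preview", 648, 61_694),
    ("x-ai/grok-3-mini-beta", 790, 44_805),
]


def breaks_per_thousand_chats(breaks, chats):
    """Hypothetical normalization: breaks per 1,000 chats."""
    return 1000 * breaks / chats


# Sort from lowest to highest rate, mirroring the fewest-first table order.
by_rate = sorted(models, key=lambda m: breaks_per_thousand_chats(m[1], m[2]))

for name, breaks, chats in by_rate:
    rate = breaks_per_thousand_chats(breaks, chats)
    print(f"{name}: {rate:.2f} breaks per 1,000 chats")
```

The rate-based order does not have to match the raw-count ranking, since a model with more chats has had more opportunities to be broken.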
Top 50 users ranked by number of total breaks.
Ranking | User | Number of Breaks |
---|---|---|
1. | Wyatt Walls | 152 |
2. | Kkonatruck (Lim Vesna) | 152 |
3. | Philip | 133 |
4. | Rasta_Obi_Wan | 116 |
5. | arvkevi | 113 |
6. | Mika | 107 |
7. | diogenesofsinope | 106 |
8. | WINTER IS COMING IS HERE IS GONE | 104 |
9. | Schultzika | 102 |
10. | destroyer | 101 |
11. | Mephisto | 101 |
12. | JSON_Voorhees | 101 |
13. | Miko | 97 |
14. | pianoangel | 94 |
15. | Hecoroni | 90 |
16. | Scrattlebeard | 85 |
17. | Keza | 85 |
18. | PQ_Marz | 77 |
19. | Roquentin | 74 |
20. | 0Xmmm0 | 67 |
21. | ASM1 | 61 |
22. | papasmurf | 60 |
23. | Alumek | 56 |
24. | Rift | 55 |
25. | PainterDino | 55 |
26. | Lawdoc | 52 |
27. | D4NGLZ | 51 |
28. | Lyren | 51 |
29. | CAU_add | 47 |
30. | @snaykey | 47 |
31. | kjc | 45 |
32. | spokes | 42 |
33. | n3quation | 42 |
34. | Haris Umair (Strawberry_Cake) | 41 |
35. | Jono | 38 |
36. | half_duplex_prince | 38 |
37. | Ahsan | 37 |
38. | OFF | 35 |
39. | Agl | 33 |
40. | bigmo6286 | 30 |
41. | toxicwind | 29 |
42. | Rickmance | 28 |
43. | 1ZA | 28 |
44. | finalfix | 27 |
45. | Blanwhit | 27 |
46. | Cornerbrook | 26 |
47. | Karel | 26 |
48. | -- | 24 |
49. | Neonkraft | 24 |
50. | DAMAGE INC | 23 |
Top 20 users ranked by number of total breaks in wave 2 (the second week). Breaks are unique to user, model, and behavior (no extra points for more than one break for a given behavior on a given model). Ties broken by speed.
Rank | User | Breaks | Prize |
---|---|---|---|
1 | Kkonatruck (Lim Vesna) | 56 | $1,250 |
2 | Wyatt Walls | 56 | $1,000 |
3 | WINTER IS COMING IS HERE IS GONE | 56 | $500 |
4 | diogenesofsinope | 56 | $500 |
5 | Schultzika | 56 | $500 |
6 | Philip | 56 | $250 |
7 | arvkevi | 55 | $250 |
8 | Hecoroni | 50 | $250 |
9 | Mephisto | 49 | $250 |
10 | JSON_Voorhees | 48 | $250 |
11 | Mika | 48 | $250 |
12 | Keza | 48 | $250 |
13 | Rasta_Obi_Wan | 47 | $250 |
14 | destroyer | 47 | $250 |
15 | Alumek | 47 | $250 |
16 | D4NGLZ | 44 | $250 |
17 | pianoangel | 43 | $250 |
18 | Roquentin | 43 | $250 |
19 | Scrattlebeard | 43 | $250 |
20 | Miko | 43 | $250 |
Top 20 users ranked by number of total breaks in wave 3 (the third week). Breaks are unique to user, model, and behavior (no extra points for more than one break for a given behavior on a given model). Ties broken by speed.
Rank | User | Breaks | Prize |
---|---|---|---|
1 | Wyatt Walls | 72 | $1,250 |
2 | Kkonatruck (Lim Vesna) | 71 | $1,000 |
3 | Miko | 54 | $500 |
4 | Philip | 54 | $500 |
5 | PainterDino | 54 | $500 |
6 | Rasta_Obi_Wan | 52 | $250 |
7 | pianoangel | 51 | $250 |
8 | WINTER IS COMING IS HERE IS GONE | 46 | $250 |
9 | Schultzika | 44 | $250 |
10 | 0Xmmm0 | 44 | $250 |
11 | @snaykey | 43 | $250 |
12 | Mika | 42 | $250 |
13 | Hecoroni | 40 | $250 |
14 | n3quation | 39 | $250 |
15 | destroyer | 39 | $250 |
16 | Mephisto | 38 | $250 |
17 | PQ_Marz | 37 | $250 |
18 | Ahsan | 37 | $250 |
19 | arvkevi | 35 | $250 |
20 | JSON_Voorhees | 35 | $250 |
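The wave scoring rule (each break counted once per user, model, and behavior, with ties broken by speed) can be sketched as follows. This is a minimal illustration, not the leaderboard's actual implementation: the function `rank_wave`, the tuple layout, and the reading of "speed" as the time at which a user logged their last counted break are all assumptions.

```python
def rank_wave(breaks):
    """Rank users for one wave.

    breaks: iterable of (user, model, behavior, timestamp) tuples,
    where timestamp is any comparable value (hypothetical layout).
    """
    # Keep only the earliest break for each (user, model, behavior) triple,
    # so repeat breaks of the same behavior on the same model earn nothing.
    first_seen = {}
    for user, model, behavior, ts in breaks:
        key = (user, model, behavior)
        if key not in first_seen or ts < first_seen[key]:
            first_seen[key] = ts

    counts = {}       # user -> number of unique breaks
    finished_at = {}  # user -> time of their latest counted break
    for (user, _model, _behavior), ts in first_seen.items():
        counts[user] = counts.get(user, 0) + 1
        finished_at[user] = max(finished_at.get(user, ts), ts)

    # More breaks first; among equal counts, the earlier finisher wins
    # (assumed interpretation of "ties broken by speed").
    return sorted(counts, key=lambda u: (-counts[u], finished_at[u]))


sample = [
    ("alice", "model-a", "behavior-1", 10),
    ("alice", "model-a", "behavior-1", 12),  # repeat, not counted again
    ("alice", "model-b", "behavior-1", 30),
    ("bob", "model-a", "behavior-1", 5),
    ("bob", "model-a", "behavior-2", 20),
    ("carol", "model-a", "behavior-1", 40),
]
ranking = rank_wave(sample)
print(ranking)
```

In the sample, alice and bob each have two unique breaks, and bob ranks first because his last counted break came earlier.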
For each model, the first user to break all Wave 2 behaviors wins $278.
Model | User | Time | Prize |
---|---|---|---|
x-ai/grok-3-mini-beta | Kkonatruck (Lim Vesna) | Sat May 17 2025 18:14:56 GMT+0000 (Coordinated Universal Time) | $278 |
deepseek/deepseek-r1 | diogenesofsinope | Sat May 17 2025 22:50:47 GMT+0000 (Coordinated Universal Time) | $278 |
qwen/qwen3-235b-a22b | diogenesofsinope | Sun May 18 2025 00:08:38 GMT+0000 (Coordinated Universal Time) | $278 |
google/gemini-2.5-flash-preview:thinking | Mika | Sun May 18 2025 00:45:24 GMT+0000 (Coordinated Universal Time) | $278 |
google/gemini-2.5-pro-preview | diogenesofsinope | Sun May 18 2025 01:26:28 GMT+0000 (Coordinated Universal Time) | $278 |
nova-pro-v1 | Kkonatruck (Lim Vesna) | Sun May 18 2025 11:22:18 GMT+0000 (Coordinated Universal Time) | $278 |
nova-premier-v1 | Kkonatruck (Lim Vesna) | Sun May 18 2025 17:39:09 GMT+0000 (Coordinated Universal Time) | $278 |
anthropic/claude-3.7-sonnet:thinking | Wyatt Walls | Sun May 18 2025 18:03:16 GMT+0000 (Coordinated Universal Time) | $278 |
For each model, the first user to break all Wave 3 behaviors wins $278.
Model | User | Time | Prize |
---|---|---|---|
x-ai/grok-3-mini-beta | Kkonatruck (Lim Vesna) | Sun May 25 2025 06:48:59 GMT+0000 (Coordinated Universal Time) | $278 |
google/gemini-2.5-pro-preview | Kkonatruck (Lim Vesna) | Mon May 26 2025 06:39:25 GMT+0000 (Coordinated Universal Time) | $278 |
anthropic/claude-3.7-sonnet:thinking | Wyatt Walls | Tue May 27 2025 03:54:15 GMT+0000 (Coordinated Universal Time) | $278 |
nova-premier-v1 | Kkonatruck (Lim Vesna) | Thu May 29 2025 12:39:41 GMT+0000 (Coordinated Universal Time) | $278 |
nova-pro-v1 | Kkonatruck (Lim Vesna) | Thu May 29 2025 13:32:53 GMT+0000 (Coordinated Universal Time) | $278 |
google/gemini-2.5-flash-preview:thinking | Wyatt Walls | Fri May 30 2025 01:07:09 GMT+0000 (Coordinated Universal Time) | $278 |
qwen/qwen3-235b-a22b | Wyatt Walls | Fri May 30 2025 15:50:19 GMT+0000 (Coordinated Universal Time) | $278 |
deepseek/deepseek-r1 | Kkonatruck (Lim Vesna) | Fri May 30 2025 20:01:08 GMT+0000 (Coordinated Universal Time) | $278 |
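The per-model prize rule (the first user to break every behavior of a wave on a given model wins) can be sketched like this. The function `first_full_clear`, the event tuple layout, and the numeric timestamps are hypothetical; how the real leaderboard records breaks is not described in the source.

```python
def first_full_clear(breaks, behaviors):
    """Return {model: (user, time)} for the first user who broke
    every behavior in `behaviors` on that model.

    breaks: iterable of (user, model, behavior, timestamp) tuples
    (hypothetical layout).
    """
    required = set(behaviors)
    progress = {}  # (model, user) -> set of behaviors broken so far
    winners = {}   # model -> (user, completion time)

    # Replay the breaks in chronological order.
    for user, model, behavior, ts in sorted(breaks, key=lambda b: b[3]):
        if model in winners or behavior not in required:
            continue  # prize already claimed, or behavior not in this wave
        done = progress.setdefault((model, user), set())
        done.add(behavior)
        if done == required:
            winners[model] = (user, ts)
    return winners


events = [
    ("alice", "model-a", "b1", 1),
    ("bob", "model-a", "b1", 2),
    ("bob", "model-a", "b2", 3),    # bob completes model-a first
    ("alice", "model-a", "b2", 4),  # too late for the model-a prize
    ("alice", "model-b", "b1", 5),  # model-b is still incomplete
]
winners = first_full_clear(events, ["b1", "b2"])
print(winners)
```

Models that nobody has fully cleared simply do not appear in the result, which matches a prize table that lists one row per claimed model.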