Agent Red-Teaming Leaderboard
Push the limits of direct and indirect attacks on AI agents.
Models ranked by Attack Success Rate (Total Breaks divided by Total Chats), lowest first.
Ranking | Model | Total Breaks | Total Chats | Attack Success Rate |
---|---|---|---|---|
1. | anthropic/claude-3.7-sonnet:thinking | 1,636 | 112,507 | 1.45% |
2. | anthropic/claude-3.7-sonnet | 1,941 | 120,580 | 1.61% |
3. | anthropic/claude-3.5-sonnet | 1,838 | 99,355 | 1.85% |
4. | openai/gpt-4o | 2,434 | 100,518 | 2.42% |
5. | anthropic/claude-3.5-haiku-20241022 | 2,281 | 93,016 | 2.45% |
6. | openai/o3-2025-04-16 | 361 | 14,718 | 2.45% |
7. | openai/o1 | 2,082 | 81,485 | 2.56% |
8. | openai/gpt-4.5-preview | 2,461 | 95,296 | 2.58% |
9. | Model Spica | 3,190 | 106,768 | 2.99% |
10. | Model Arcturus | 3,491 | 105,239 | 3.32% |
11. | cohere/command-r-08-2024 | 3,891 | 104,305 | 3.73% |
12. | Model Pollux | 3,110 | 82,635 | 3.76% |
13. | Model Andromeda | 3,276 | 80,942 | 4.05% |
14. | Model Castor | 3,393 | 83,685 | 4.05% |
15. | x-ai/grok-2-1212 | 3,497 | 84,645 | 4.13% |
16. | openai/o3-mini | 3,083 | 72,932 | 4.23% |
17. | Model Fomalhaut | 3,471 | 79,369 | 4.37% |
18. | Model Orion | 1,658 | 35,885 | 4.62% |
19. | openai/o3-mini-high | 3,039 | 64,171 | 4.74% |
20. | meta-llama/llama-3.1-405b-instruct | 3,403 | 57,904 | 5.88% |
21. | mistralai/pixtral-large-2411 | 4,112 | 65,961 | 6.23% |
22. | meta-llama/llama-3.3-70b-instruct | 5,000 | 77,143 | 6.48% |
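The Attack Success Rate column is consistent with Total Breaks divided by Total Chats, shown as a percentage with two decimals. The following is a minimal sketch of that arithmetic, not the leaderboard's actual implementation; the function name `attack_success_rate` and the `rows` sample are illustrative, with the numbers copied from three rows of the table above.

```python
def attack_success_rate(total_breaks: int, total_chats: int) -> float:
    """Return breaks / chats as a percentage, rounded to two decimals."""
    if total_chats <= 0:
        raise ValueError("total_chats must be positive")
    return round(100 * total_breaks / total_chats, 2)


# (model, total breaks, total chats), copied from the table above
rows = [
    ("meta-llama/llama-3.3-70b-instruct", 5000, 77143),
    ("anthropic/claude-3.7-sonnet:thinking", 1636, 112507),
    ("openai/gpt-4o", 2434, 100518),
]

# Sort on the unrounded ratio so rows that display the same rounded
# percentage still get a well-defined order.
ranked = sorted(rows, key=lambda r: r[1] / r[2])

for position, (model, breaks, chats) in enumerate(ranked, start=1):
    print(f"{position}. {model}: {attack_success_rate(breaks, chats):.2f}%")
```

Sorting on the unrounded ratio matters for rows 5 and 6, which both display 2.45%: their unrounded values are roughly 2.4523% and 2.4528%, which matches the order in the table.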
Top 100 users ranked by number of total breaks across all waves 1-4. Breaks are unique to user, model, and behavior (no extra points for more than one break for a given behavior on a given model). Ties broken by speed. This leaderboard is live and continues to update even after the waves have closed.
Ranking | User | Total Breaks |
---|---|---|
1. | zardav | 924 |
2. | Wyatt Walls | 924 |
3. | Bob1 | 924 |
4. | Clovis Mint | 924 |
5. | Strigiformes | 924 |
6. | P1njec70r | 910 |
7. | diogenesofsinope | 899 |
8. | _Stellaris | 848 |
9. | Philip | 843 |
10. | Scrattlebeard | 835 |
11. | aistormwatch | 751 |
12. | <bracket_stylist> | 724 |
13. | EfficientZ | 707 |
14. | jonsoft | 659 |
15. | Zen | 658 |
16. | jungletsubasa | 656 |
17. | WINTER IS COMING IS HERE IS GONE | 648 |
18. | @snaykey | 616 |
19. | 0Xmmm0 | 612 |
20. | Schultzika | 610 |
21. | TheMusZero | 606 |
22. | Massi1 | 605 |
23. | Mika | 599 |
24. | Lyren | 596 |
25. | CloakEngaged | 579 |
26. | destroyer | 577 |
27. | Xaurish | 577 |
28. | ayyou | 562 |
29. | Keza | 555 |
30. | jamjam | 551 |
31. | Haris Umair (Strawberry_Cake) | 547 |
32. | Chukwuma Chukwuma | 540 |
33. | JSON_Voorhees | 535 |
34. | Abdelmoumen | 531 |
35. | rmcloud | 528 |
36. | ayzendan | 524 |
37. | Solomon Zoe | 522 |
38. | idben | 517 |
39. | D4NGLZ | 512 |
40. | Liodeus | 506 |
41. | Aditya Gupta | 505 |
42. | h4xor | 504 |
43. | g4t0uz0 | 504 |
44. | Kkonatruck (Lim Veansa) | 503 |
45. | Casta27 | 502 |
46. | Zihui Wu, Xidian University | 500 |
47. | Roquentin | 490 |
48. | kjc | 489 |
49. | jhon smina | 485 |
50. | JMP_0X00 | 483 |
51. | KeybordKlutz | 476 |
52. | ASM1 | 472 |
53. | Tofu | 457 |
54. | last_underdog | 455 |
55. | Alumek | 450 |
56. | PQ_Marz | 440 |
57. | Tantrum | 437 |
58. | Tici | 430 |
59. | meilz381 | 410 |
60. | rockmon | 405 |
61. | GabeG888 | 401 |
62. | CAU_add | 374 |
63. | Miko | 370 |
64. | finalfix | 354 |
65. | Tidan2 | 351 |
66. | spokes | 331 |
67. | Toaster | 322 |
68. | Karthick Settu | 319 |
69. | arvkevi | 319 |
70. | T3 | 305 |
71. | The normal one | 300 |
72. | MK-JP | 299 |
73. | unit_ready | 295 |
74. | NicknameGG | 285 |
75. | DrOwzer | 277 |
76. | Tim Kuipers | 266 |
77. | Mephisto | 258 |
78. | papasmurf | 258 |
79. | OFF | 255 |
80. | tst | 247 |
81. | Lawdoc | 242 |
82. | Hecoroni | 235 |
83. | pianoangel | 233 |
84. | Nael | 226 |
85. | Admin | 224 |
86. | cell_vu411 | 222 |
87. | BlackWizzards | 219 |
88. | Khalnayak | 207 |
89. | StupidLongHorses | 194 |
90. | Smk | 190 |
91. | W | 181 |
92. | blakejwc | 180 |
93. | xfth | 177 |
94. | Spaghet | 175 |
95. | Azaz3l | 174 |
96. | P4C3MK3R | 172 |
97. | uoqki | 171 |
98. | Brás Cubas | 169 |
99. | Cornerbrook | 168 |
100. | Carbonic | 167 |
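The scoring rule above (one point per unique user, model, and behavior combination, with ties broken by speed) can be sketched as follows. This is an illustration under stated assumptions rather than the organizers' code: the event tuple layout and the name `rank_users` are hypothetical, and "speed" is assumed to mean the time at which a user reached their final unique-break count, which the page does not define.

```python
from collections import defaultdict


def rank_users(break_events):
    """Rank users by unique breaks.

    break_events: iterable of (user, model, behavior, timestamp) tuples
    (hypothetical layout). Repeat breaks of the same behavior on the same
    model by the same user are counted once.
    """
    # Keep only the earliest timestamp for each (user, model, behavior).
    first_seen = {}
    for user, model, behavior, ts in break_events:
        key = (user, model, behavior)
        if key not in first_seen or ts < first_seen[key]:
            first_seen[key] = ts

    counts = defaultdict(int)
    finished_at = {}  # assumed tie-breaker: when the last unique break landed
    for (user, _model, _behavior), ts in first_seen.items():
        counts[user] += 1
        finished_at[user] = max(finished_at.get(user, ts), ts)

    ordered = sorted(counts, key=lambda u: (-counts[u], finished_at[u]))
    return [(user, counts[user]) for user in ordered]


events = [
    ("a", "m1", "b1", 1),
    ("a", "m1", "b1", 2),  # duplicate break, no extra point
    ("a", "m2", "b1", 5),
    ("b", "m1", "b1", 3),
    ("b", "m1", "b2", 4),
    ("c", "m1", "b1", 1),
]
print(rank_users(events))
```

In this toy input, users `a` and `b` both have two unique breaks, and `b` ranks first because the second break landed earlier.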
Top 40 users ranked by number of total breaks in wave 1 (the first week). Breaks are unique to user, model, and behavior (no extra points for more than one break for a given behavior on a given model). Ties broken by speed.
Rank | User | Breaks | Prize |
---|---|---|---|
1 | Clovis Mint | 180 | $1,000 |
2 | Wyatt Walls | 180 | $500 |
3 | P1njec70r | 177 | $500 |
4 | Scrattlebeard | 176 | $500 |
5 | EfficientZ | 175 | $500 |
6 | zardav | 174 | $500 |
7 | diogenesofsinope | 174 | $500 |
8 | Solomon Zoe | 173 | $500 |
9 | Bob1 | 172 | $500 |
10 | Philip | 169 | $500 |
11 | Azaz3l | 168 | $200 |
12 | Chukwuma Chukwuma | 168 | $200 |
13 | _Stellaris | 167 | $200 |
14 | TheMusZero | 167 | $200 |
15 | Strigiformes | 165 | $200 |
16 | aistormwatch | 163 | $200 |
17 | Xaurish | 160 | $200 |
18 | WINTER IS COMING IS HERE IS GONE | 156 | $200 |
19 | Admin | 153 | $200 |
20 | Tici | 152 | $200 |
21 | Haris Umair (Strawberry_Cake) | 151 | $200 |
22 | StupidLongHorses | 150 | $200 |
23 | Tantrum | 150 | $200 |
24 | Liodeus | 149 | $200 |
25 | Lyren | 149 | $200 |
26 | spokes | 148 | $200 |
27 | Brás Cubas | 144 | $200 |
28 | <bracket_stylist> | 142 | $200 |
29 | Alumek | 139 | $200 |
30 | ASM1 | 134 | $200 |
31 | GabeG888 | 133 | $100 |
32 | Massi1 | 133 | $100 |
33 | Roquentin | 129 | $100 |
34 | Zihui Wu, Xidian University | 129 | $100 |
35 | Andre Becker Norton | 128 | $100 |
36 | CloakEngaged | 128 | $100 |
37 | Abdelmoumen | 128 | $100 |
38 | Arelath | 127 | $100 |
39 | destroyer | 125 | $100 |
40 | PQ_Marz | 124 | $100 |
Top 40 users ranked by number of total breaks in wave 2 (the second week). Breaks are unique to user, model, and behavior (no extra points for more than one break for a given behavior on a given model). Ties broken by speed.
Rank | User | Breaks | Prize |
---|---|---|---|
1 | Clovis Mint | 180 | $1,000 |
2 | Strigiformes | 180 | $500 |
3 | Wyatt Walls | 180 | $500 |
4 | zardav | 180 | $500 |
5 | Tim Kuipers | 178 | $500 |
6 | P1njec70r | 177 | $500 |
7 | diogenesofsinope | 175 | $500 |
8 | _Stellaris | 170 | $500 |
9 | Bob1 | 170 | $500 |
10 | Scrattlebeard | 170 | $500 |
11 | aistormwatch | 159 | $200 |
12 | EfficientZ | 156 | $200 |
13 | <bracket_stylist> | 153 | $200 |
14 | Philip | 151 | $200 |
15 | jamjam | 146 | $200 |
16 | Solomon Zoe | 144 | $200 |
17 | TheMusZero | 143 | $200 |
18 | Liodeus | 141 | $200 |
19 | Tici | 137 | $200 |
20 | WINTER IS COMING IS HERE IS GONE | 135 | $200 |
21 | Massi1 | 134 | $200 |
22 | Keza | 132 | $200 |
23 | jungletsubasa | 131 | $200 |
24 | Lyren | 131 | $200 |
25 | GabeG888 | 129 | $200 |
26 | Zihui Wu, Xidian University | 128 | $200 |
27 | Haris Umair (Strawberry_Cake) | 127 | $200 |
28 | ayyou | 126 | $200 |
29 | Abdelmoumen | 126 | $200 |
30 | Mika | 126 | $200 |
31 | Zen | 125 | $100 |
32 | PQ_Marz | 122 | $100 |
33 | g4t0uz0 | 119 | $100 |
34 | Tofu | 118 | $100 |
35 | D4NGLZ | 116 | $100 |
36 | 0Xmmm0 | 116 | $100 |
37 | Tidan2 | 115 | $100 |
38 | Xaurish | 114 | $100 |
39 | idben | 113 | $100 |
40 | BitFlip | 108 | $100 |
Top 60 users ranked by number of total breaks in wave 3 (the third week). Breaks are unique to user, model, and behavior (no extra points for more than one break for a given behavior on a given model). Ties broken by speed.
Rank | User | Breaks | Prize |
---|---|---|---|
1 | Clovis Mint | 220 | $1,250 |
2 | Wyatt Walls | 220 | $750 |
3 | zardav | 220 | $750 |
4 | P1njec70r | 220 | $500 |
5 | Scrattlebeard | 220 | $500 |
6 | diogenesofsinope | 219 | $500 |
7 | Strigiformes | 218 | $500 |
8 | _Stellaris | 209 | $500 |
9 | Philip | 206 | $500 |
10 | Bob1 | 203 | $500 |
11 | Schultzika | 201 | $500 |
12 | EfficientZ | 199 | $500 |
13 | ayzendan | 197 | $500 |
14 | aistormwatch | 190 | $500 |
15 | <bracket_stylist> | 186 | $500 |
16 | Massi1 | 184 | $250 |
17 | jamjam | 183 | $250 |
18 | Keza | 177 | $250 |
19 | Solomon Zoe | 170 | $250 |
20 | 0Xmmm0 | 163 | $250 |
21 | Zen | 162 | $250 |
22 | WINTER IS COMING IS HERE IS GONE | 160 | $250 |
23 | rmcloud | 157 | $250 |
24 | Casta27 | 156 | $250 |
25 | h4xor | 154 | $250 |
26 | TheMusZero | 152 | $250 |
27 | JSON_Voorhees | 151 | $250 |
28 | Liodeus | 144 | $250 |
29 | Lyren | 143 | $250 |
30 | Haris Umair (Strawberry_Cake) | 140 | $250 |
31 | Zihui Wu, Xidian University | 137 | $250 |
32 | Mika | 137 | $250 |
33 | NicknameGG | 136 | $250 |
34 | rockmon | 134 | $250 |
35 | finalfix | 133 | $250 |
36 | jungletsubasa | 133 | $250 |
37 | @snaykey | 132 | $250 |
38 | ayyou | 130 | $250 |
39 | destroyer | 130 | $250 |
40 | Aditya Gupta | 128 | $250 |
41 | D4NGLZ | 127 | $250 |
42 | GabeG888 | 123 | $250 |
43 | MK-JP | 123 | $250 |
44 | Roquentin | 122 | $250 |
45 | JMP_0X00 | 121 | $250 |
46 | The normal one | 121 | $250 |
47 | ASM1 | 118 | $250 |
48 | Alumek | 117 | $250 |
49 | Abdelmoumen | 113 | $250 |
50 | last_underdog | 111 | $250 |
51 | idben | 110 | $250 |
52 | Tofu | 108 | $250 |
53 | DrOwzer | 108 | $250 |
54 | kjc | 108 | $250 |
55 | CAU_add | 108 | $250 |
56 | PQ_Marz | 107 | $250 |
57 | Tidan2 | 107 | $250 |
58 | Tici | 106 | $250 |
59 | jhon smina | 104 | $250 |
60 | screenguard | 102 | $250 |
Top 100 users ranked by number of total breaks across all waves 1-4. Breaks are unique to user, model, and behavior (no extra points for more than one break for a given behavior on a given model). Ties broken by speed. Top 10 users fast-tracked for interviews at Gray Swan and UK AISI. This leaderboard is a snapshot of final results and associated prizes, and was closed at the end of the last wave.
Rank | User | Breaks | Prize |
---|---|---|---|
1 | zardav | 924 | $2,000 |
2 | Wyatt Walls | 924 | $1,750 |
3 | Bob1 | 924 | $1,500 |
4 | Clovis Mint | 924 | $1,250 |
5 | Strigiformes | 924 | $1,000 |
6 | P1njec70r | 910 | $500 |
7 | diogenesofsinope | 899 | $500 |
8 | _Stellaris | 848 | $500 |
9 | Philip | 843 | $500 |
10 | Scrattlebeard | 835 | $500 |
11 | aistormwatch | 751 | $500 |
12 | <bracket_stylist> | 724 | $500 |
13 | EfficientZ | 707 | $500 |
14 | Zen | 657 | $500 |
15 | jungletsubasa | 656 | $500 |
16 | WINTER IS COMING IS HERE IS GONE | 640 | $500 |
17 | @snaykey | 616 | $500 |
18 | 0Xmmm0 | 612 | $500 |
19 | Schultzika | 605 | $500 |
20 | Massi1 | 605 | $500 |
21 | TheMusZero | 605 | $500 |
22 | Mika | 599 | $500 |
23 | Lyren | 596 | $500 |
24 | destroyer | 577 | $500 |
25 | CloakEngaged | 574 | $500 |
26 | Xaurish | 571 | $500 |
27 | ayyou | 559 | $500 |
28 | Keza | 555 | $500 |
29 | jamjam | 551 | $500 |
30 | Haris Umair (Strawberry_Cake) | 547 | $500 |
31 | Chukwuma Chukwuma | 540 | $500 |
32 | JSON_Voorhees | 535 | $500 |
33 | Abdelmoumen | 531 | $500 |
34 | rmcloud | 528 | $500 |
35 | ayzendan | 524 | $500 |
36 | Solomon Zoe | 522 | $500 |
37 | idben | 517 | $500 |
38 | D4NGLZ | 512 | $500 |
39 | Liodeus | 506 | $500 |
40 | Aditya Gupta | 505 | $500 |
41 | h4xor | 504 | $500 |
42 | g4t0uz0 | 504 | $500 |
43 | Kkonatruck (Lim Veansa) | 503 | $500 |
44 | Casta27 | 502 | $500 |
45 | Zihui Wu, Xidian University | 500 | $500 |
46 | Roquentin | 490 | $500 |
47 | kjc | 489 | $500 |
48 | JMP_0X00 | 481 | $500 |
49 | jhon smina | 480 | $500 |
50 | KeybordKlutz | 476 | $500 |
51 | ASM1 | 472 | $250 |
52 | Tofu | 457 | $250 |
53 | last_underdog | 455 | $250 |
54 | Alumek | 450 | $250 |
55 | Tantrum | 437 | $250 |
56 | Tici | 430 | $250 |
57 | PQ_Marz | 420 | $250 |
58 | meilz381 | 410 | $250 |
59 | rockmon | 402 | $250 |
60 | GabeG888 | 401 | $250 |
61 | CAU_add | 372 | $250 |
62 | Miko | 370 | $250 |
63 | jonsoft | 369 | $250 |
64 | finalfix | 354 | $250 |
65 | Tidan2 | 351 | $250 |
66 | spokes | 331 | $250 |
67 | Toaster | 322 | $250 |
68 | Karthick Settu | 319 | $250 |
69 | arvkevi | 314 | $250 |
70 | T3 | 305 | $250 |
71 | The normal one | 300 | $250 |
72 | unit_ready | 295 | $250 |
73 | NicknameGG | 285 | $250 |
74 | DrOwzer | 277 | $250 |
75 | MK-JP | 276 | $250 |
76 | Tim Kuipers | 266 | $250 |
77 | Mephisto | 258 | $250 |
78 | papasmurf | 258 | $250 |
79 | OFF | 255 | $250 |
80 | Lawdoc | 242 | $250 |
81 | tst | 237 | $250 |
82 | Hecoroni | 235 | $250 |
83 | pianoangel | 233 | $250 |
84 | Nael | 226 | $250 |
85 | Admin | 224 | $250 |
86 | BlackWizzards | 219 | $250 |
87 | cell_vu411 | 218 | $250 |
88 | Khalnayak | 207 | $250 |
89 | StupidLongHorses | 194 | $250 |
90 | Smk | 190 | $250 |
91 | W | 181 | $250 |
92 | blakejwc | 179 | $250 |
93 | xfth | 177 | $250 |
94 | Spaghet | 175 | $250 |
95 | Azaz3l | 174 | $250 |
96 | P4C3MK3R | 172 | $250 |
97 | Brás Cubas | 169 | $250 |
98 | uoqki | 168 | $250 |
99 | Carbonic | 167 | $250 |
100 | Ansh | 163 | $250 |
For each model, the first user to break all Wave 1 behaviors wins $300.
Model | User | Time | Prize |
---|---|---|---|
meta-llama/llama-3.3-70b-instruct | spokes | Mar 8, 2025 11:13 AM PDT | $300 |
Model Spica | Smk | Mar 8, 2025 12:19 PM PDT | $300 |
meta-llama/llama-3.1-405b-instruct | ayukh | Mar 8, 2025 12:24 PM PDT | $300 |
Model Pollux | Scrattlebeard | Mar 8, 2025 2:01 PM PDT | $300 |
Model Fomalhaut | Strigiformes | Mar 8, 2025 4:35 PM PDT | $300 |
Model Castor | Clovis Mint | Mar 8, 2025 7:59 PM PDT | $300 |
Model Arcturus | Henry Walker | Mar 8, 2025 11:03 PM PDT | $300 |
Model Andromeda | Smk | Mar 9, 2025 1:14 AM PDT | $300 |
mistralai/pixtral-large-2411 | Smk | Mar 9, 2025 1:51 AM PDT | $300 |
x-ai/grok-2-1212 | Scrattlebeard | Mar 9, 2025 3:24 AM PDT | $300 |
openai/o3-mini-high | Clovis Mint | Mar 9, 2025 5:41 AM PDT | $300 |
openai/o3-mini | Clovis Mint | Mar 9, 2025 8:12 AM PDT | $300 |
anthropic/claude-3.5-haiku-20241022 | Scrattlebeard | Mar 9, 2025 11:55 AM PDT | $300 |
openai/o1 | Scrattlebeard | Mar 9, 2025 2:27 PM PDT | $300 |
cohere/command-r-08-2024 | P1njec70r | Mar 9, 2025 4:28 PM PDT | $300 |
anthropic/claude-3.5-sonnet | zardav | Mar 11, 2025 11:57 AM PDT | $300 |
openai/gpt-4.5-preview | Azaz3l | Mar 11, 2025 2:42 PM PDT | $300 |
openai/gpt-4o | Clovis Mint | Mar 11, 2025 5:07 PM PDT | $300 |
anthropic/claude-3.7-sonnet:thinking | Scrattlebeard | Mar 12, 2025 3:11 AM PDT | $300 |
anthropic/claude-3.7-sonnet | zardav | Mar 12, 2025 6:12 AM PDT | $300 |
openai/o3-2025-04-16 | Tici | Apr 5, 2025 10:21 AM PDT | $300 |
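One way to determine a first-to-break-all winner per model, as in the table above, is to replay break events in time order and stop tracking a model once any user has covered every behavior in the wave. The sketch below assumes a hypothetical event layout of (user, model, behavior, timestamp) and a known list of wave behaviors; the name `first_full_clear` is illustrative, and none of this is confirmed to match how the prizes were actually computed.

```python
def first_full_clear(break_events, behaviors):
    """Return {model: (user, timestamp)} for the first user to break
    every behavior in `behaviors` on that model."""
    required = set(behaviors)
    seen = {}     # (user, model) -> behaviors broken so far
    winners = {}  # model -> (user, timestamp of the completing break)

    for user, model, behavior, ts in sorted(break_events, key=lambda e: e[3]):
        # Skip models that already have a winner and behaviors outside the wave.
        if model in winners or behavior not in required:
            continue
        broken = seen.setdefault((user, model), set())
        broken.add(behavior)
        if broken == required:
            winners[model] = (user, ts)
    return winners


events = [
    ("a", "m", "b1", 1),
    ("b", "m", "b1", 2),
    ("b", "m", "b2", 3),  # b completes the set first
    ("a", "m", "b2", 4),
]
print(first_full_clear(events, ["b1", "b2"]))
```

Here user `a` landed the first break, but `b` is the first to cover both behaviors and so is recorded as the winner for model `m`.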
For each model, the first user to break all Wave 2 behaviors wins $300.
Model | User | Time | Prize |
---|---|---|---|
mistralai/pixtral-large-2411 | P1njec70r | Mar 15, 2025 10:58 AM PDT | $300 |
meta-llama/llama-3.3-70b-instruct | GabeG888 | Mar 15, 2025 11:09 AM PDT | $300 |
Model Andromeda | P1njec70r | Mar 15, 2025 11:19 AM PDT | $300 |
meta-llama/llama-3.1-405b-instruct | GabeG888 | Mar 15, 2025 11:46 AM PDT | $300 |
Model Castor | P1njec70r | Mar 15, 2025 11:49 AM PDT | $300 |
Model Spica | Smk | Mar 15, 2025 12:07 PM PDT | $300 |
Model Pollux | Scrattlebeard | Mar 15, 2025 12:08 PM PDT | $300 |
Model Arcturus | Smk | Mar 15, 2025 1:31 PM PDT | $300 |
cohere/command-r-08-2024 | GabeG888 | Mar 15, 2025 2:56 PM PDT | $300 |
openai/o3-mini | NicknameGG | Mar 15, 2025 4:13 PM PDT | $300 |
openai/o3-mini-high | Clovis Mint | Mar 15, 2025 4:54 PM PDT | $300 |
openai/gpt-4.5-preview | NicknameGG | Mar 15, 2025 6:23 PM PDT | $300 |
openai/gpt-4o | NicknameGG | Mar 15, 2025 7:37 PM PDT | $300 |
anthropic/claude-3.5-haiku-20241022 | GabeG888 | Mar 15, 2025 7:46 PM PDT | $300 |
Model Fomalhaut | Clovis Mint | Mar 15, 2025 8:17 PM PDT | $300 |
openai/o1 | Clovis Mint | Mar 15, 2025 9:23 PM PDT | $300 |
x-ai/grok-2-1212 | Clovis Mint | Mar 16, 2025 5:23 AM PDT | $300 |
anthropic/claude-3.5-sonnet | Wyatt Walls | Mar 16, 2025 6:27 AM PDT | $300 |
anthropic/claude-3.7-sonnet:thinking | Clovis Mint | Mar 16, 2025 9:36 AM PDT | $300 |
anthropic/claude-3.7-sonnet | Clovis Mint | Mar 16, 2025 9:37 AM PDT | $300 |
openai/o3-2025-04-16 | Wyatt Walls | Apr 5, 2025 2:25 AM PDT | $300 |
For each model, the first user to break all Wave 3 behaviors wins $300.
Model | User | Time | Prize |
---|---|---|---|
mistralai/pixtral-large-2411 | P1njec70r | Mar 22, 2025 11:59 AM PDT | $300 |
Model Fomalhaut | P1njec70r | Mar 22, 2025 12:04 PM PDT | $300 |
Model Castor | P1njec70r | Mar 22, 2025 12:16 PM PDT | $300 |
meta-llama/llama-3.3-70b-instruct | GabeG888 | Mar 22, 2025 12:24 PM PDT | $300 |
Model Arcturus | NicknameGG | Mar 22, 2025 12:39 PM PDT | $300 |
Model Andromeda | P1njec70r | Mar 22, 2025 12:48 PM PDT | $300 |
openai/o3-mini-high | P1njec70r | Mar 22, 2025 1:40 PM PDT | $300 |
x-ai/grok-2-1212 | GabeG888 | Mar 22, 2025 1:55 PM PDT | $300 |
Model Spica | NicknameGG | Mar 22, 2025 2:53 PM PDT | $300 |
meta-llama/llama-3.1-405b-instruct | GabeG888 | Mar 22, 2025 2:56 PM PDT | $300 |
cohere/command-r-08-2024 | _Stellaris | Mar 22, 2025 3:04 PM PDT | $300 |
anthropic/claude-3.5-haiku-20241022 | P1njec70r | Mar 22, 2025 3:34 PM PDT | $300 |
openai/o1 | P1njec70r | Mar 22, 2025 4:38 PM PDT | $300 |
Model Pollux | Bob1 | Mar 22, 2025 5:07 PM PDT | $300 |
openai/o3-mini | rmcloud | Mar 22, 2025 5:10 PM PDT | $300 |
openai/gpt-4.5-preview | Clovis Mint | Mar 22, 2025 7:59 PM PDT | $300 |
openai/gpt-4o | Bob1 | Mar 22, 2025 10:03 PM PDT | $300 |
anthropic/claude-3.5-sonnet | zardav | Mar 23, 2025 3:51 AM PDT | $300 |
anthropic/claude-3.7-sonnet:thinking | Scrattlebeard | Mar 23, 2025 12:32 PM PDT | $300 |
anthropic/claude-3.7-sonnet | Scrattlebeard | Mar 23, 2025 1:28 PM PDT | $300 |
Model Orion | P1njec70r | Mar 26, 2025 8:15 AM PDT | $300 |
openai/o3-2025-04-16 | ayzendan | Apr 4, 2025 8:48 AM PDT | $300 |
For each model, the first user to break all Wave 4 behaviors wins $300.
Model | User | Time | Prize |
---|---|---|---|
meta-llama/llama-3.3-70b-instruct | Casta27 | Mar 29, 2025 11:18 AM PDT | $300 |
mistralai/pixtral-large-2411 | Smk | Mar 29, 2025 12:29 PM PDT | $300 |
openai/o3-mini | Bob1 | Mar 29, 2025 4:12 PM PDT | $300 |
x-ai/grok-2-1212 | Strigiformes | Mar 30, 2025 3:37 AM PDT | $300 |
Model Pollux | Bob1 | Mar 30, 2025 5:17 AM PDT | $300 |
anthropic/claude-3.7-sonnet | zardav | Mar 30, 2025 10:38 AM PDT | $300 |
openai/o3-mini-high | Bob1 | Mar 30, 2025 11:02 AM PDT | $300 |
anthropic/claude-3.5-sonnet | zardav | Mar 30, 2025 11:44 AM PDT | $300 |
meta-llama/llama-3.1-405b-instruct | Wyatt Walls | Mar 30, 2025 12:02 PM PDT | $300 |
openai/o1 | Bob1 | Mar 30, 2025 2:37 PM PDT | $300 |
openai/gpt-4.5-preview | zardav | Mar 30, 2025 2:46 PM PDT | $300 |
Model Fomalhaut | Bob1 | Mar 30, 2025 3:49 PM PDT | $300 |
Model Castor | Bob1 | Mar 30, 2025 4:30 PM PDT | $300 |
anthropic/claude-3.7-sonnet:thinking | zardav | Mar 30, 2025 6:44 PM PDT | $300 |
Model Andromeda | Aditya Gupta | Mar 31, 2025 7:37 AM PDT | $300 |
Model Arcturus | _Stellaris | Mar 31, 2025 7:48 AM PDT | $300 |
cohere/command-r-08-2024 | _Stellaris | Mar 31, 2025 7:49 AM PDT | $300 |
Model Spica | _Stellaris | Mar 31, 2025 9:15 AM PDT | $300 |
openai/gpt-4o | zardav | Mar 31, 2025 9:51 AM PDT | $300 |
anthropic/claude-3.5-haiku-20241022 | Wyatt Walls | Mar 31, 2025 10:12 AM PDT | $300 |
Model Orion | zardav | Mar 31, 2025 10:44 AM PDT | $300 |
openai/o3-2025-04-16 | Wyatt Walls | Apr 6, 2025 5:42 AM PDT | $300 |
Top 20 users ranked by number of valid, unique over-refusals submitted across all waves 1-4. Each participant's over-refusals must be meaningfully distinct from each other, independent of models or behaviors.
Rank | User | Prize |
---|---|---|
1 | Chukwuma Chukwuma | $500 |
2 | Chi | $500 |
3 | Monkey King | $500 |
4 | Wang | $500 |
5 | Bob1 | $500 |
6 | Denny_4_2 | $500 |
7 | Sepoy | $500 |
8 | GabeG888 | $500 |
9 | Clovis Mint | $500 |
10 | Mayank | $500 |
11 | S S | $500 |
12 | JonathanRR | $500 |
13 | Carbonic | $500 |
14 | Joey2Chill | $500 |
15 | Emphyrio | $500 |
16 | Toaster | $500 |
17 | Alumek | $500 |
18 | papasmurf | $500 |
19 | finalfix | $500 |
20 | jamjam | $500 |