💣 Single Turn Harmful Outputs Leaderboard

🏁 Started 10 months ago

$38,000 of $42,000 awarded

Attempt to break various large language models (LLMs) with a single chat message.

Last updated a month ago

Models ranked by User Break Rate

| Ranking | Model | Safety Violation Count | Total Requests | User Break Rate |
|---:|---|---:|---:|---:|
| 1 | cygnet-bulwark | 0 | 14,945 | 0.00% |
| 2 | cygnet-knox | 0 | 9,284 | 0.00% |
| 3 | cygnet-citadel | 2 | 13,211 | 0.02% |
| 4 | o1-preview | 7 | 713 | 0.98% |
| 5 | o1-mini | 14 | 1,119 | 1.25% |
| 6 | claude-3-5-sonnet-20240620 | 45 | 2,907 | 1.55% |
| 7 | google/gemini-pro-1.5 | 41 | 2,539 | 1.61% |
| 8 | claude-3-sonnet-20240229 | 40 | 2,112 | 1.89% |
| 9 | claude-3-opus-20240229 | 48 | 2,097 | 2.29% |
| 10 | meta-llama/llama-3.1-405b-instruct | 51 | 2,219 | 2.30% |
| 11 | claude-3-haiku-20240307 | 52 | 2,095 | 2.48% |
| 12 | google/gemini-flash-1.5 | 60 | 2,384 | 2.52% |
| 13 | meta-llama/llama-3.1-8b-instruct | 57 | 2,103 | 2.71% |
| 14 | gpt-4-0125-preview | 55 | 1,674 | 3.29% |
| 15 | gpt-4o-mini-2024-07-18 | 64 | 1,569 | 4.08% |
| 16 | gpt-4o-2024-08-06 | 63 | 1,542 | 4.09% |
| 17 | meta-llama/llama-3-70b-instruct | 69 | 1,593 | 4.33% |
| 18 | microsoft/phi-3.5-mini-128k-instruct | 64 | 1,470 | 4.35% |
| 19 | meta-llama/llama-3.1-70b-instruct | 76 | 1,656 | 4.59% |
| 20 | gpt-4-turbo-2024-04-09 | 71 | 1,455 | 4.88% |
| 21 | qwen/qwen-2-72b-instruct | 67 | 1,368 | 4.90% |
| 22 | google/gemma-2-27b-it | 72 | 1,462 | 4.92% |
| 23 | google/gemma-2-9b-it | 73 | 1,479 | 4.94% |
| 24 | qwen/qwen-2-7b-instruct | 61 | 1,107 | 5.51% |
| 25 | cohere/command-r-plus-08-2024 | 109 | 1,535 | 7.10% |
| 26 | microsoft/wizardlm-2-8x22b | 76 | 988 | 7.69% |
| 27 | mistralai/mistral-large | 98 | 1,117 | 8.77% |
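
The User Break Rate column is consistent with a simple ratio: safety violations divided by total requests for each model. Below is a minimal sketch that recomputes the rate from a few rows of the table above, assuming that definition (the `user_break_rate` helper and the data layout are illustrative, not the leaderboard's actual code):

```python
# Sketch: recompute User Break Rate, assuming rate = violations / requests.
# Sample rows copied from the leaderboard table above.
leaderboard = [
    # (model, safety_violation_count, total_requests)
    ("cygnet-bulwark", 0, 14_945),
    ("o1-preview", 7, 713),
    ("claude-3-5-sonnet-20240620", 45, 2_907),
]

def user_break_rate(violations: int, requests: int) -> float:
    """Fraction of requests that produced a safety violation."""
    return violations / requests

for model, violations, requests in leaderboard:
    rate = user_break_rate(violations, requests)
    print(f"{model}: {rate:.2%}")  # e.g. "o1-preview: 0.98%"
```

Running this reproduces the published percentages to two decimal places (0.00%, 0.98%, 1.55%), which supports the assumed definition.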