💬 Multi-Turn Harmful Outputs Leaderboard
✅ Completed 6 months ago
All $7,000 awarded
Elicit harmful outputs from LLMs through long-context interactions across multiple messages.
Last updated 3 months ago
Models ranked by total number of breaks, fewest first. NOTE: Numbers are subject to change while submission uniqueness is being evaluated.
Ranking | Model | Total Breaks (Unverified) | Total Requests | Break Request Ratio |
---|---|---|---|---|
1. | claude-3-5-sonnet-20241022 | 293 | 6,323 | 0.046 |
2. | gpt-4o-2024-08-06 | 297 | 4,436 | 0.067 |
3. | o1 | 350 | 3,026 | 0.116 |
4. | meta-llama/llama-3.2-90b-vision-instruct | 352 | 3,572 | 0.099 |
5. | google/gemini-pro-1.5 | 406 | 3,246 | 0.125 |
6. | x-ai/grok-beta | 619 | 6,616 | 0.094 |
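
The Break Request Ratio column appears to be Total Breaks divided by Total Requests, rounded to three decimals. The page does not state the formula, so treat that definition as an assumption. The sketch below recomputes the column from the rows above under that assumption; `break_request_ratio` and `ROWS` are hypothetical names introduced here for illustration.

```python
# Rows copied from the table above: (model, total breaks, total requests).
ROWS = [
    ("claude-3-5-sonnet-20241022", 293, 6323),
    ("gpt-4o-2024-08-06", 297, 4436),
    ("o1", 350, 3026),
    ("meta-llama/llama-3.2-90b-vision-instruct", 352, 3572),
    ("google/gemini-pro-1.5", 406, 3246),
    ("x-ai/grok-beta", 619, 6616),
]


def break_request_ratio(breaks: int, requests: int) -> float:
    """Assumed definition: breaks per request, rounded to three decimals."""
    return round(breaks / requests, 3)


# Recompute the ratio for every model.
ratios = {model: break_request_ratio(b, r) for model, b, r in ROWS}

# Alternative ordering: lowest ratio first instead of fewest raw breaks.
by_ratio = sorted(ratios, key=ratios.get)

for model in by_ratio:
    print(f"{ratios[model]:.3f}  {model}")
```

Under this assumed definition the recomputed values match the published column. Because the models received different numbers of requests, ordering by ratio differs from the ranking by raw break count: x-ai/grok-beta, for example, has the most breaks but only the third-lowest ratio.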