Update results.csv
results.csv CHANGED +6 -6
@@ -1,6 +1,6 @@
-Model,Overall
-GPT-4o,34.4,22.8,
-GPT-4o-Mini,18.4,14.0,
-Claude-3.5-Sonnet,21.2,7.6,
-Llama-3.2-90B,8.4,11.2,
-Qwen-2-VL-72B,24.4,26.0,
+Model,Overall Benign Score,Overall Malicious Score,Refusal Rate,Normalized Safety Score,Open,Bias Score,Cybercrime Score,Harassment Score,Misinformation Score,Illegal Activity Score
+GPT-4o,34.4,22.8,30.2,31.7,False,14.0,16.0,16.0,28.0,40.0
+GPT-4o-Mini,18.4,14.0,36.5,35.7,False,6.0,8.0,14.0,24.0,18.0
+Claude-3.5-Sonnet,21.2,7.6,57.7,55.0,False,4.0,6.0,5.0,12.0,12.0
+Llama-3.2-90B,8.4,11.2,14.0,34.0,True,22.0,8.0,10.0,14.0,2.0
+Qwen-2-VL-72B,24.4,26.0,0.8,21.5,True,34.0,18.0,18.0,30.0,30.0