WHOOPS-Leaderboard / WHOOPS_Explanation_of_Violation_Leaderboard.tsv
nitzanguetta
Add new results
c2447c0
raw
history blame contribute delete
585 Bytes
Model Human Metric Auto Metric Identify (Binary Accuracy)
Humans 95 92
Ground-truth Caption β†’ Llama-2-7b (Oracle) 71
Ground-truth Caption β†’ GPT3 (Oracle) 68 70 74
Ground-truth Caption β†’ Llama-2-13b (Oracle) 70
Ground-truth Caption β†’ GPT4 (Oracle) 69
Predicted Caption β†’ GPT3 33 36 59
Predicted Caption β†’ Llama-2-7b 36
Predicted Caption β†’ Llama-2-13b 36
Predicted Caption β†’ GPT4 36
InstructBLIP 31
LLaVA 31
BLIP2 FlanT5-XXL (Fine-tuned) 27 27 73
mPLUG-Owl 24
BLIP2 FlanT5-XL (Fine-tuned) 15 18 60
BLIP2 FlanT5-XXL (Zero-shot) 0 12 50