blair-johnson committed on
Commit e9839ea · 1 Parent(s): 409bd1a

Update README.md

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -107,6 +107,15 @@ GALACTICA 30B Evol-Instruct was fine-tuned in 196 hours using 16 A100 80GB GPUs,
 
 ## Performance and Limitations
 
+Common benchmark scores generated using the [Eleuther AI LLM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/master).
+
+| Task | Version | Metric | Value | Stderr |
+|------|---------|--------|-------|--------|
+| arc_challenge 25-shot | 0 | acc | 0.4684 | 0.146 |
+| | | acc_norm | 0.4787 | 0.146 |
+| hellaswag 10-shot | 0 | acc | 0.4705 | 0.0050 |
+| | | acc_norm | 0.6111 | 0.0049 |
+
 Qualitative evaluation suggests that the evol-instruct-70k fine-tuned Galactica models are significantly more controllable and attentive to user prompts than the Alpaca fine-tuned GALPACA models.
 
 ## Works Cited
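
The Stderr column in the added benchmark table can be sanity-checked with a short sketch. It rests on assumptions the source does not state: that each reported stderr is the standard error of a mean of per-example 0/1 scores, and that the evaluation sets contain 1172 examples (ARC-Challenge test split) and 10042 examples (HellaSwag validation split). The names `ASSUMED_N`, `ROWS`, and `binomial_stderr` are illustrative and do not come from the README or the evaluation harness.

```python
import math

# Assumed evaluation-set sizes (NOT stated in the source): the ARC-Challenge
# test split and the HellaSwag validation split.
ASSUMED_N = {"arc_challenge": 1172, "hellaswag": 10042}

# (task, metric, value, reported stderr), copied from the table in the diff.
ROWS = [
    ("arc_challenge", "acc", 0.4684, 0.146),
    ("arc_challenge", "acc_norm", 0.4787, 0.146),
    ("hellaswag", "acc", 0.4705, 0.0050),
    ("hellaswag", "acc_norm", 0.6111, 0.0049),
]


def binomial_stderr(p, n):
    """Standard error of the mean of n independent 0/1 outcomes with mean p."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    for task, metric, value, reported in ROWS:
        estimate = binomial_stderr(value, ASSUMED_N[task])
        print(f"{task:14s} {metric:9s} reported={reported:.4f} estimated={estimate:.4f}")
```

Under these assumptions, the HellaSwag estimates agree with the reported values to four decimal places, while the ARC estimates come out near 0.0146 rather than the reported 0.146. The source does not confirm which value is intended, so the table is reproduced as committed.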