GeorgiaTechResearchInstitute/galactica-30b-evol-instruct-70k

Update README.md

Commit 75cead6 (1 parent: e9839ea), committed by blair-johnson on Jun 27, 2023

Files changed (1)

README.md +2 -0

@@ -111,11 +111,13 @@ Common benchmark scores generated using the [Eleuther AI LLM Evaluation Harness]
 | Task | Version | Metric | Value | Stderr |
 |------|---------|--------|-------|--------|
+| MMLU 5-shot | 1 | acc | 0.4420 |        |
 | arc_challenge 25-shot | 0 | acc | 0.4684 | 0.146 |
 |                       |   | acc_norm | 0.4787 | 0.146 |
 |hellaswag 10-shot| 0 | acc | 0.4705 | 0.0050 |
 |                 |   | acc_norm | 0.6111 | 0.0049 |
 Qualitative evaluation suggests that the evol-instruct-70k fine-tuned Galactica models are significantly more controllable and attentive to user prompts than the Alpaca fine-tuned GALPACA models.
 ## Works Cited