## AGIEval Performance

We compare our results to the base Mistral-7B model (using LM Evaluation Harness).
We find **129%** of the base model's performance on AGI Eval, averaging **0.397**.
We also significantly improve upon the official `mistralai/Mistral-7B-Instruct-v0.1` finetune, achieving **119%** of its performance.

![AGIEval Performance](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca/resolve/main/Images/MistralOrca7BAGIEval.png "AGIEval Performance")
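
For readers who want to reproduce this kind of comparison, here is a minimal sketch using EleutherAI's lm-evaluation-harness. The `agieval` group name and the v0.4-style `simple_evaluate` API are assumptions about your installed harness version, not the exact invocation behind these numbers; run `lm-eval --tasks list` to confirm available task names.

```python
# Minimal sketch: scoring the model with EleutherAI's lm-evaluation-harness.
# Assumes lm-eval v0.4+; the "agieval" group name may differ across versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # load via Hugging Face transformers
    model_args="pretrained=Open-Orca/Mistral-7B-OpenOrca,dtype=bfloat16",
    tasks=["agieval"],  # assumed task group; substitute "bbh" for BigBench-Hard
    batch_size=8,
)

# Per-task metrics live under results["results"].
for task, metrics in results["results"].items():
    print(task, metrics)
```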

## BigBench-Hard Performance

We find **119%** of the base model's performance on BigBench-Hard, averaging **0.416**.

![BigBench-Hard Performance](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca/resolve/main/Images/MistralOrca7BBigBenchHard.png "BigBench-Hard Performance")

## GPT4ALL Leaderboard Performance

We gain a slight edge over our previous releases, again topping the leaderboard, averaging **72.38**.

![GPT4ALL Performance](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca/resolve/main/Images/MistralOrca7BGPT4ALL.png "GPT4ALL Performance")
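
The leaderboard figure is a plain mean of per-task accuracy over the GPT4All evaluation suite. Below is a sketch of that aggregation, assuming the commonly used seven-task set and lm-eval v0.4 metric keys (neither is specified on this card):

```python
# Assumed GPT4All task suite; the exact set is not spelled out in this card.
import lm_eval

GPT4ALL_TASKS = ["boolq", "piqa", "hellaswag", "winogrande",
                 "arc_easy", "arc_challenge", "openbookqa"]

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Open-Orca/Mistral-7B-OpenOrca,dtype=bfloat16",
    tasks=GPT4ALL_TASKS,
    batch_size=8,
)

# "acc,none" is the v0.4 metric key; older versions report plain "acc".
accs = [m["acc,none"] for m in results["results"].values()]
print(f"GPT4ALL average: {100 * sum(accs) / len(accs):.2f}")
```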

# Dataset