update benchmarks
README.md
CHANGED
@@ -48,13 +48,15 @@ Results on common sense reasoning benchmarks
 ```
 Model                   BoolQ      PIQA       HellaSwag   WinoGrande   ARC-e      ARC-c      OBQA
 ----------------------- ---------- ---------- ----------- ------------ ---------- ---------- ----------
-GPT4All-J 6.7B
+GPT4All-J 6.7B v1.0     73.4       74.8       63.4        64.7         54.9       36.0       40.2
+GPT4All-J v1.1-breezy   74.0       75.1       63.2        63.6         55.4       34.9       38.4
+GPT4All-J v1.2-jazzy    *74.8*     74.9       63.6        63.8         56.6       35.3       41.0
 GPT4All-J Lora 6.7B     68.6       75.8       66.2        63.5         56.4       35.7       40.2
 GPT4All LLaMa Lora 7B   73.1       77.6       72.1        67.8         51.1       40.4       40.2
 Dolly 6B                68.8       77.3       67.6        63.9         62.9       38.7       41.2
 Dolly 12B               56.7       75.4       71.0        62.2         *64.6*     38.5       40.4
 Alpaca 7B               73.9       77.2       73.9        66.1         59.8       43.3       43.4
-Alpaca Lora 7B
+Alpaca Lora 7B          74.3       *79.3*     *74.0*      *68.8*       56.6       *43.9*     *42.6*
 GPT-J 6.7B              65.4       76.2       66.2        64.1         62.2       36.6       38.2
 LLaMa 7B                73.1       77.4       73.0        66.9         52.5       41.4       42.4
 Pythia 6.7B             63.5       76.3       64.0        61.1         61.3       35.2       37.2