Update README.md
README.md
CHANGED
@@ -29,26 +29,18 @@ The models are fine-tuned with the WEBINSTRUCT dataset using the original Llama-
 The models are evaluated using open-ended and multiple-choice math problems from several datasets. Here are the results:
 
 
-
-
-
-
-|
-
-|
-|
-| **MAmmoTH2-8x7B**
-|
-|
-| **MAmmoTH2-
-| | PoT | 51.6 | 28.7 | 43.3 | 52.3 | 65.1 | 41.9 | 48.2 | 39.1 | 44.6 | 46.1 |
-| | **Hybrid** | **53.6** | **31.5** | **44.5** | **61.2** | **67.7** | **46.3** | **41.2** | **42.7** | **42.6** | **47.9** |
-| **MAmmoTH2-8B-Plus** | CoT | 22.4 | 7.9 | 36.2 | 36.0 | 37.0 | 8.2 | 7.2 | 32.7 | 34.6 | 24.7 |
-| | PoT | 58.8 | 32.1 | 47.2 | 57.1 | 71.1 | 53.9 | 44.6 | 40.0 | 47.8 | 50.3 |
-| | **Hybrid** | **59.4** | **33.4** | **47.2** | **66.4** | **71.4** | **55.4** | **45.9** | **40.5** | **48.3** | **52.0** |
-| **MAmmoTH2-8x7B-Plus** | CoT | 56.3 | 12.9 | 45.3 | 45.6 | 53.8 | 11.7 | 22.4 | 43.6 | 42.3 | 37.1 |
-| | PoT | 61.3 | 32.6 | 48.8 | 59.6 | 72.2 | 48.5 | 40.3 | 46.8 | 45.4 | 50.6 |
-| | **Hybrid** | **62.0** | **34.2** | **51.6** | **68.7** | **72.4** | **49.2** | **43.2** | **46.8** | **47.6** | **52.9** |
+Sure, here's the information presented in the format you provided:
+
+Certainly, here's the updated table with the model names in bold:
+
+| **Model** | **Decoding** | **GSM** | **MATH** | **GPQA** | **MMLU-ST** | **BBH** | **ARC-C** | **Avg** |
+|------------------------|--------------|---------|----------|----------|-------------|---------|-----------|---------|
+| **MAmmoTH2-7B** | 26.7 | 34.2 | 67.4 | 34.8 | 60.6 | 60.0 | 81.8 | 52.2 |
+| **MAmmoTH2-8B** | 29.7 | 33.4 | 67.9 | 38.4 | 61.0 | 60.8 | 81.0 | 53.1 |
+| **MAmmoTH2-8x7B** | 32.2 | 39.0 | 75.4 | 36.8 | 67.4 | 71.1 | 87.5 | 58.9 |
+| **MAmmoTH2-7B-Plus** | 29.2 | 45.0 | 84.7 | 36.8 | 64.5 | 63.1 | 83.0 | 58.0 |
+| **MAmmoTH2-8B-Plus** | 32.5 | 42.8 | 84.1 | 37.3 | 65.7 | 67.8 | 83.4 | 59.1 |
+| **MAmmoTH2-8x7B-Plus** | 34.1 | 47.0 | 86.4 | 37.8 | 72.4 | 74.1 | 88.4 | 62.9 |
 
 
 