TIGER-Lab/MAmmoTH2-8x7B


aaabiao committed on May 7, 2024

Commit c73788e · verified · 1 Parent(s): ffe24df

Update README.md

Files changed (1)

README.md +8 -12


@@ -29,18 +29,14 @@ The models are fine-tuned with the WEBINSTRUCT dataset using the original Llama-
 The models are evaluated using open-ended and multiple-choice math problems from several datasets. Here are the results:
-Sure, here's the information presented in the format you provided:
-Certainly, here's the updated table with the model names in bold:
-| **Model**              | **Decoding** | **GSM** | **MATH** | **GPQA** | **MMLU-ST** | **BBH** | **ARC-C** | **Avg** |
-|------------------------|--------------|---------|----------|----------|-------------|---------|-----------|---------|
-| **MAmmoTH2-7B**        | 26.7         | 34.2    | 67.4     | 34.8     | 60.6        | 60.0    | 81.8      | 52.2    |
-| **MAmmoTH2-8B**        | 29.7         | 33.4    | 67.9     | 38.4     | 61.0        | 60.8    | 81.0      | 53.1    |
-| **MAmmoTH2-8x7B**      | 32.2         | 39.0    | 75.4     | 36.8     | 67.4        | 71.1    | 87.5      | 58.9    |
-| **MAmmoTH2-7B-Plus**   | 29.2         | 45.0    | 84.7     | 36.8     | 64.5        | 63.1    | 83.0      | 58.0    |
-| **MAmmoTH2-8B-Plus**   | 32.5         | 42.8    | 84.1     | 37.3     | 65.7        | 67.8    | 83.4      | 59.1    |
-| **MAmmoTH2-8x7B-Plus** | 34.1         | 47.0    | 86.4     | 37.8     | 72.4        | 74.1    | 88.4      | 62.9    |

 The models are evaluated using open-ended and multiple-choice math problems from several datasets. Here are the results:
+| **Model**              | **TheoremQA** | **MATH** | **GSM8K** | **GPQA** | **MMLU-ST** | **BBH** | **ARC-C** | **Avg** |
+|------------------------|---------------|----------|-----------|----------|-------------|---------|-----------|---------|
+| **MAmmoTH2-7B**        | 26.7          | 34.2     | 67.4      | 34.8     | 60.6        | 60.0    | 81.8      | 52.2    |
+| **MAmmoTH2-8B**        | 29.7          | 33.4     | 67.9      | 38.4     | 61.0        | 60.8    | 81.0      | 53.1    |
+| **MAmmoTH2-8x7B**      | 32.2          | 39.0     | 75.4      | 36.8     | 67.4        | 71.1    | 87.5      | 58.9    |
+| **MAmmoTH2-7B-Plus**   | 29.2          | 45.0     | 84.7      | 36.8     | 64.5        | 63.1    | 83.0      | 58.0    |
+| **MAmmoTH2-8B-Plus**   | 32.5          | 42.8     | 84.1      | 37.3     | 65.7        | 67.8    | 83.4      | 59.1    |
+| **MAmmoTH2-8x7B-Plus** | 34.1          | 47.0     | 86.4      | 37.8     | 72.4        | 74.1    | 88.4      | 62.9    |