wenhu commited on
Commit
b5698a9
1 Parent(s): 5e63f99

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +3 -3
README.md CHANGED
@@ -34,10 +34,10 @@ The models are evaluated using open-ended and multiple-choice math problems from
34
  | **Model** | **TheoremQA** | **MATH** | **GSM8K** | **GPQA** | **MMLU-ST** | **BBH** | **ARC-C** | **Avg** |
35
  |:---------------------------------------|:--------------|:---------|:----------|:---------|:------------|:--------|:----------|:--------|
36
  | **MAmmoTH2-7B** (Updated) | 29.0 | 36.7 | 68.4 | 32.4 | 62.4 | 58.6 | 81.7 | 52.7 |
37
- | **MAmmoTH2-8B** | 30.3 | 35.8 | 70.4 | 35.2 | 64.2 | 62.1 | 82.2 | 54.3 |
38
  | **MAmmoTH2-8x7B** | 32.2 | 39.0 | 75.4 | 36.8 | 67.4 | 71.1 | 87.5 | 58.9 |
39
- | **MAmmoTH2-7B-Plus** | 31.2 | 46.0 | 84.6 | 33.8 | 63.8 | 63.3 | 84.4 | 58.1 |
40
- | **MAmmoTH2-8B-Plus** | 31.5 | 43.0 | 85.2 | 35.8 | 66.7 | 69.7 | 84.3 | 59.4 |
41
  | **MAmmoTH2-8x7B-Plus** | 34.1 | 47.0 | 86.4 | 37.8 | 72.4 | 74.1 | 88.4 | 62.9 |
42
 
43
  To reproduce our results, please refer to https://github.com/TIGER-AI-Lab/MAmmoTH2/tree/main/math_eval.
 
34
  | **Model** | **TheoremQA** | **MATH** | **GSM8K** | **GPQA** | **MMLU-ST** | **BBH** | **ARC-C** | **Avg** |
35
  |:---------------------------------------|:--------------|:---------|:----------|:---------|:------------|:--------|:----------|:--------|
36
  | **MAmmoTH2-7B** (Updated) | 29.0 | 36.7 | 68.4 | 32.4 | 62.4 | 58.6 | 81.7 | 52.7 |
37
+ | **MAmmoTH2-8B** (Updated) | 30.3 | 35.8 | 70.4 | 35.2 | 64.2 | 62.1 | 82.2 | 54.3 |
38
  | **MAmmoTH2-8x7B** | 32.2 | 39.0 | 75.4 | 36.8 | 67.4 | 71.1 | 87.5 | 58.9 |
39
+ | **MAmmoTH2-7B-Plus** (Updated) | 31.2 | 46.0 | 84.6 | 33.8 | 63.8 | 63.3 | 84.4 | 58.1 |
40
+ | **MAmmoTH2-8B-Plus** (Updated) | 31.5 | 43.0 | 85.2 | 35.8 | 66.7 | 69.7 | 84.3 | 59.4 |
41
  | **MAmmoTH2-8x7B-Plus** | 34.1 | 47.0 | 86.4 | 37.8 | 72.4 | 74.1 | 88.4 | 62.9 |
42
 
43
  To reproduce our results, please refer to https://github.com/TIGER-AI-Lab/MAmmoTH2/tree/main/math_eval.