update results
README.md
CHANGED
@@ -205,12 +205,12 @@ The model was evaluated on the ifeval, mmlu_pro and gsm8k_platinum using [lm-ev
 
 ### Accuracy
 
-
 | Benchmark | inference-optimization/MiniMax-M2.5-BF16 | inference-optimization/MiniMax-M2.5.w8a8 | Recovery (%) |
-|-----------|------------------------------------------|------------------------------------------
-| GSM8k Platinum (0-shot) | 95.15 |
-| IfEval (0-shot) |
-| AIME 2025 | 87.50 |
+|-----------|------------------------------------------|------------------------------------------|--------------|
+| GSM8k Platinum (0-shot) | 95.15 | 95.18 | 100.03 |
+| IfEval (0-shot) | 92.05 | 90.33 | 98.13 |
+| AIME 2025 | 87.50 | 88.33 | 100.95 |
 | GPQA diamond | 83.67 | 84.51 | 101.01 |
-| Math 500 | 87.33 | 87.
+| Math 500 | 87.33 | 87.13 | 99.77 |
+| MMLU Pro Chat | 80.83 | 81.25 | 100.51 |
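
The Recovery (%) column appears to be the quantized (w8a8) score expressed as a percentage of the BF16 baseline score. The README does not state the formula, so treat that definition as an assumption. Under it, the minimal sketch below recomputes the column from the scores in the updated table. The names `RESULTS`, `recovery`, and `check_table` are illustrative and not part of the repository. A small tolerance is used because the reported values may have been derived from unrounded scores.

```python
# Scores copied from the updated table:
# benchmark -> (BF16 baseline, w8a8 quantized, reported recovery %).
RESULTS = {
    "GSM8k Platinum (0-shot)": (95.15, 95.18, 100.03),
    "IfEval (0-shot)": (92.05, 90.33, 98.13),
    "AIME 2025": (87.50, 88.33, 100.95),
    "GPQA diamond": (83.67, 84.51, 101.01),
    "Math 500": (87.33, 87.13, 99.77),
    "MMLU Pro Chat": (80.83, 81.25, 100.51),
}


def recovery(baseline: float, quantized: float) -> float:
    """Assumed definition: quantized score as a percentage of the baseline."""
    return quantized / baseline * 100.0


def check_table(results: dict, tolerance: float = 0.02) -> dict:
    """Return rows whose reported recovery differs from the recomputed
    value by more than `tolerance` percentage points."""
    mismatches = {}
    for name, (baseline, quantized, reported) in results.items():
        computed = recovery(baseline, quantized)
        if abs(computed - reported) > tolerance:
            mismatches[name] = (reported, round(computed, 2))
    return mismatches


if __name__ == "__main__":
    for name, (baseline, quantized, reported) in RESULTS.items():
        computed = recovery(baseline, quantized)
        print(f"{name}: reported {reported:.2f}, recomputed {computed:.2f}")
    print("mismatches:", check_table(RESULTS))
```

Every row agrees with the assumed formula to within the 0.02-point tolerance. The two rows that differ in the last digit (GPQA diamond and MMLU Pro Chat) are consistent with rounding of the underlying scores.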