Update README.md
README.md
CHANGED
@@ -141,16 +141,16 @@ All YuuKi RxG results are evaluated under standard benchmark conditions using [l
 
 <br>
 
-### Reasoning and
-
-| Model | AIME 24 | AIME 25 |
-|:------|:-------:|:-------:|:-----------:|:------------:|:-------------:|
-| Qwen3-8B | 76.0 | 67.3 |
-| Phi-4-Reasoning-Plus 14B | 81.3 | 78.0 |
-| Gemini-2.5-Flash-Thinking | 82.3 | 72.0 |
-| o3-mini (medium) | 79.6 | 76.7 |
-| DeepSeek-R1-8B | 86.0 | 76.3 | 61.
-| **YuuKi RxG 8B** | **87.3** | **77.1** | **
+### Reasoning, Mathematics and Cognitive Profile
+
+| Model | AIME 24 | AIME 25 | GPQA Diamond | NHE (Distance) | YHE (Humanity) | BHE (Beyond) |
+|:------|:-------:|:-------:|:------------:|:--------------:|:--------------:|:-------------:|
+| Qwen3-8B | 76.0 | 67.3 | 62.0 | 22 | 83.3 | 2.6 |
+| Phi-4-Reasoning-Plus 14B | 81.3 | 78.0 | 69.3 | 24.4 | 87.3 | 1.4 |
+| Gemini-2.5-Flash-Thinking | 82.3 | 72.0 | 82.8 | — | — | — |
+| o3-mini (medium) | 79.6 | 76.7 | 76.8 | — | — | — |
+| DeepSeek-R1-8B | 86.0 | 76.3 | 61.1 | 25 | 86.7 | 3.2 |
+| **YuuKi RxG 8B** | **87.3** | **77.1** | **64.0** | **27.0%** | **85.4%** | **4.0%** |
 
 <br>
 