OpceanAI committed on
Commit
42b3ccf
·
verified ·
1 Parent(s): da2bf8e

Update README.md

Files changed (1)
  1. README.md +10 -10
README.md CHANGED
@@ -141,16 +141,16 @@ All YuuKi RxG results are evaluated under standard benchmark conditions using [l
 
  <br>
 
- ### Reasoning and Mathematics
-
- | Model | AIME 24 | AIME 25 | HMMT Feb 25 | GPQA Diamond | LiveCodeBench |
- |:------|:-------:|:-------:|:-----------:|:------------:|:-------------:|
- | Qwen3-8B | 76.0 | 67.3 | | 62.0 | |
- | Phi-4-Reasoning-Plus 14B | 81.3 | 78.0 | 53.6 | 69.3 | |
- | Gemini-2.5-Flash-Thinking | 82.3 | 72.0 | 64.2 | 82.8 | 62.3 |
- | o3-mini (medium) | 79.6 | 76.7 | 53.3 | 76.8 | 65.9 |
- | DeepSeek-R1-8B | 86.0 | 76.3 | 61.5 | 61.1 | 60.5 |
- | **YuuKi RxG 8B** | **87.3** | **77.1** | **63.2** | **64.0** | **62.0** |
+ ### Reasoning, Mathematics and Cognitive Profile
+
+ | Model | AIME 24 | AIME 25 | GPQA Diamond | NHE (Distance) | YHE (Humanity) | BHE (Beyond) |
+ |:------|:-------:|:-------:|:------------:|:--------------:|:--------------:|:-------------:|
+ | Qwen3-8B | 76.0 | 67.3 | 62.0 | 22 | 83.3 | 2.6 |
+ | Phi-4-Reasoning-Plus 14B | 81.3 | 78.0 | 69.3 | 24.4 | 87.3 | 1.4 |
+ | Gemini-2.5-Flash-Thinking | 82.3 | 72.0 | 82.8 | | | — |
+ | o3-mini (medium) | 79.6 | 76.7 | 76.8 | | | — |
+ | DeepSeek-R1-8B | 86.0 | 76.3 | 61.1 | 25 | 86.7 | 3.2 |
+ | **YuuKi RxG 8B** | **87.3** | **77.1** | **64.0** | **27.0%** | **85.4%** | **4.0%** |
 
  <br>
 