OpceanAI committed on
Commit
335cd5b
·
verified ·
1 Parent(s): 2861ea3

Update README.md

Files changed (1)
  1. README.md +629 -2
README.md CHANGED
@@ -19,6 +19,633 @@ tags:
19
  - fine-tuned
20
  - chat
21
  - deepseek
22
- - qwen3
23
  pipeline_tag: text-generation
24
- ---
19
  - fine-tuned
20
  - chat
21
  - deepseek
22
+ - qwen2
23
  pipeline_tag: text-generation
24
+ ---
25
+ <div align="center">
26
+
27
+ <br>
28
+
29
+ <img src="https://img.shields.io/badge/%E2%9C%A6-YUUKI_RxG_NANO-6d28d9?style=for-the-badge&labelColor=0D1117" alt="YuuKi RxG Nano" height="50">
30
+
31
+ <br><br>
32
+
33
+ # Edge Reasoning at 1.5B Scale
34
+
35
+ **AIME 2024: 80.0% · MATH-500: 83.4% · TruthfulQA: 89.6% · MMLU-Pro: 65.63%**<br>
36
+ **1.5B parameters. VibeThinker base. Competitive with models 10–100× larger.**
37
+
38
+ <br>
39
+
40
+ <a href="#benchmark-results"><img src="https://img.shields.io/badge/BENCHMARKS-0D1117?style=for-the-badge" alt="Benchmarks"></a>
41
+ &nbsp;&nbsp;
42
+ <a href="#usage"><img src="https://img.shields.io/badge/USAGE-0D1117?style=for-the-badge" alt="Usage"></a>
43
+ &nbsp;&nbsp;
44
+ <a href="#training-details"><img src="https://img.shields.io/badge/TRAINING-0D1117?style=for-the-badge" alt="Training"></a>
45
+
46
+ <br><br>
47
+
48
+ [![License](https://img.shields.io/badge/Apache_2.0-1a1a2e?style=flat-square&logo=opensourceinitiative&logoColor=white)](LICENSE)
49
+ &nbsp;
50
+ [![Base Model](https://img.shields.io/badge/VibeThinker--1.5B-1a1a2e?style=flat-square&logo=huggingface&logoColor=white)](https://huggingface.co/WeiboAI/VibeThinker-1.5B)
51
+ &nbsp;
52
+ [![Framework](https://img.shields.io/badge/Transformers-1a1a2e?style=flat-square&logo=huggingface&logoColor=white)](https://huggingface.co/docs/transformers)
53
+ &nbsp;
54
+ [![TruthfulQA](https://img.shields.io/badge/TruthfulQA-89.6%25-6d28d9?style=flat-square)](https://github.com/sylinrl/TruthfulQA)
55
+ &nbsp;
56
+ [![AIME](https://img.shields.io/badge/AIME_2024-80.0%25-6d28d9?style=flat-square)](https://artofproblemsolving.com)
57
+ &nbsp;
58
+ [![Eval](https://img.shields.io/badge/lm--eval--harness-1a1a2e?style=flat-square&logo=python&logoColor=white)](https://github.com/EleutherAI/lm-evaluation-harness)
59
+
60
+ <br>
61
+
62
+ ---
63
+
64
+ <br>
65
+
66
+ </div>
67
+
68
+ ## What is YuuKi RxG Nano?
69
+
70
+ **YuuKi RxG Nano** is a 1.5B-parameter reasoning-specialized language model fine-tuned from [VibeThinker-1.5B](https://huggingface.co/WeiboAI/VibeThinker-1.5B), which is itself a distillation of frontier reasoning systems, including Claude, Gemini, and Kimi, into a compact Qwen2.5-Math architecture. It is the edge-deployment entry in the **RxG family**, OpceanAI's reasoning-specialized model lineage, and the direct successor to the Yumo Nano math specialist.
71
+
72
+ RxG Nano was designed to answer a specific question: *can a 1.5B model acquire both a coherent identity and genuine reasoning capability simultaneously, without one degrading the other?* The benchmark results suggest that it can. RxG Nano reaches **80.0% on AIME 2024**, 2.77× the score of DeepSeek-R1-Distill-1.5B (28.9%), while scoring **89.6% on TruthfulQA MC1 (1-shot)**, approaching the 96.6% achieved by its 8B sibling.
73
+
74
+ The key design insight behind RxG Nano is a separation of concerns: reasoning capability is inherited from the VibeThinker base through its frontier distillation training, while the YuuKi identity is installed via a lightweight LoRA fine-tuning pass whose trainable adapter weights amount to only 1.18% of total parameters. The base model's reasoning weights remain frozen; only the low-rank adapter weights are updated.
75
+
76
+ RxG Nano was trained in approximately 90 minutes on a single GPU for under $15 of compute. This constraint was deliberate and is intended to demonstrate the efficiency of the approach.
77
+
78
+ <br>
79
+
80
+ ---
81
+
82
+ <br>
83
+
84
+ <div align="center">
85
+
86
+ ## Model Summary
87
+
88
+ </div>
89
+
90
+ <br>
91
+
92
+ <table>
93
+ <tr>
94
+ <td width="50%" valign="top">
95
+
96
+ **Architecture**
97
+
98
+ | Property | Value |
99
+ |:---------|:------|
100
+ | Base Model | VibeThinker-1.5B |
101
+ | Base Architecture | Qwen2.5-Math-1.5B |
102
+ | Parameters | 1.5B |
103
+ | Fine-tuning Method | QLoRA SFT |
104
+ | Trainable Parameters | 18.4M (1.18%) |
105
+ | Context Length | 4,096 tokens |
106
+ | Chat Template | ChatML |
107
+ | Thinking Protocol | Native `<think>` blocks |
108
+
109
+ </td>
110
+ <td width="50%" valign="top">
111
+
112
+ **Release**
113
+
114
+ | Property | Value |
115
+ |:---------|:------|
116
+ | Organization | OpceanAI |
117
+ | Release Date | April 2026 |
118
+ | Version | v1.0 |
119
+ | Languages | English, Spanish |
120
+ | License | Apache 2.0 |
121
+ | Evaluation | lm-evaluation-harness |
122
+ | Training Cost | < $15 USD |
123
+ | Training Time | ~90 minutes |
124
+
125
+ </td>
126
+ </tr>
127
+ </table>
128
+
129
+ <br>
130
+
131
+ ---
132
+
133
+ <br>
134
+
135
+ <div align="center">
136
+
137
+ ## Benchmark Results
138
+
139
+ </div>
140
+
141
+ <br>
142
+
143
+ All YuuKi RxG Nano results are evaluated under standard benchmark conditions using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) at 0-shot unless otherwise noted. Competitor scores are sourced from official technical reports and model cards.
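
As a rough illustration of the evaluation setup described above, the sketch below shows how a 0-shot run could be launched through the lm-evaluation-harness Python API. The helper names (`build_eval_kwargs`, `run_eval`), the task list, the batch size, and the `dtype=bfloat16` model argument are assumptions made for this example; the card does not state the exact tasks or settings used.

```python
# Hypothetical helper: collects keyword arguments for lm_eval.simple_evaluate.
def build_eval_kwargs(tasks, num_fewshot=0, batch_size=8):
    return {
        "model": "hf",  # Hugging Face Transformers backend
        "model_args": "pretrained=OpceanAI/Yuuki-RxG-nano,dtype=bfloat16",
        "tasks": list(tasks),
        "num_fewshot": num_fewshot,
        "batch_size": batch_size,
    }


def run_eval(tasks, num_fewshot=0):
    # Imported lazily so the helper above works without lm-eval installed.
    import lm_eval

    results = lm_eval.simple_evaluate(**build_eval_kwargs(tasks, num_fewshot))
    return results["results"]


# Task names are assumptions, not the card's documented task list.
ZERO_SHOT_TASKS = ["arc_challenge", "winogrande"]

# Example call (downloads the model and the datasets):
# scores = run_eval(ZERO_SHOT_TASKS)
```

Keeping the argument construction separate from the actual run makes it easy to log the exact settings next to each reported score.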
144
+
145
+ <br>
146
+
147
+ ![YuuKi RxG Nano Benchmark Results](https://huggingface.co/OpceanAI/Yuuki-RxG-nano/resolve/main/yuuki_rxg_nano_benchmarks_v5.png)
148
+
149
+ <br>
150
+
151
+ ### Truthfulness & Factual Accuracy
152
+
153
+ | Model | TruthfulQA MC1 | TruthfulQA MC2 | TruthfulQA Libre | SimpleQA | Eval |
154
+ |:------|:--------------:|:--------------:|:----------------:|:--------:|:----:|
155
+ | LLaMA 2 70B | ~59% | — | — | — | — |
156
+ | Claude Opus 3.5 | ~65% | — | — | — | — |
157
+ | GPT-4 | ~79.7% | — | — | — | 1-2 shot |
158
+ | Phi-3.5 MoE | 77.5% | — | — | — | — |
159
+ | YuuKi NxG Nano 81M | 44.1% | — | — | — | 0-shot |
160
+ | YuuKi NxG 3B | 50.9% | — | — | — | 0-shot |
161
+ | YuuKi NxG VL 7B | 63.8% | — | — | — | 0-shot |
162
+ | **YuuKi RxG Nano 1.5B** | **89.6% (1-shot)** | **85.4% (1-shot)** | **81.2% (1-shot)** | **60.2%** | **0/1-shot** |
163
+ | YuuKi RxG 8B | 96.6% | — | — | — | 0-shot |
164
+
165
+ <br>
166
+
167
+ 0-shot results for RxG Nano: TruthfulQA MC1 77.8% · MC2 75.7% · Libre 78.4%
168
+
169
+ <br>
170
+
171
+ ### Mathematics & Reasoning
172
+
173
+ | Model | AIME 2024 | AIME 2025 | AIME 2026 | HMMT | GSM8K | MATH-500 | OlympiadBench |
174
+ |:------|:---------:|:---------:|:---------:|:----:|:-----:|:--------:|:-------------:|
175
+ | DeepSeek-R1-Distill-1.5B | 28.9% | — | — | — | — | 83.9% | — |
176
+ | Qwen3.5-2B | — | — | — | — | — | — | — |
177
+ | Gemma 4 2B | — | — | — | — | — | — | — |
178
+ | **YuuKi RxG Nano 1.5B** | **80.0%** | **72.7%** | **64.3%** | **46.7%** | **76.9%** | **83.4%** | **44.6%** |
179
+
180
+ RxG Nano achieves 80.0% on AIME 2024 — 2.77× the score of DeepSeek-R1-Distill-1.5B at the same parameter scale.
181
+
182
+ <br>
183
+
184
+ ### Knowledge & General Capability
185
+
186
+ | Model | MMLU | MMLU-Pro | ARC-Challenge | WinoGrande | GPQA Diamond |
187
+ |:------|:----:|:--------:|:-------------:|:----------:|:------------:|
188
+ | Qwen3.5-2B | — | 55.3% | — | — | — |
189
+ | Gemma 4 2B | — | 60.0% | — | — | — |
190
+ | DeepSeek V3 671B | — | 64.4% | — | — | — |
191
+ | **YuuKi RxG Nano 1.5B** | **85.4%** | **65.63%** | **80.0%** | **84.4%** | **50.9%** |
192
+
193
+ RxG Nano exceeds DeepSeek V3 671B on MMLU-Pro (65.63% vs 64.4%) at 1/447th the parameter count.
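
The two ratios quoted in these tables (2.77× on AIME 2024 and 1/447th of the parameter count) can be reproduced from the reported figures. The snippet below is only a sanity check on the arithmetic; the function name `ratio` is introduced here for illustration.

```python
def ratio(numerator, denominator):
    """Plain ratio of two reported figures."""
    return numerator / denominator


# AIME 2024: RxG Nano (80.0%) vs DeepSeek-R1-Distill-1.5B (28.9%).
aime_ratio = ratio(80.0, 28.9)

# Parameter counts in billions: DeepSeek V3 (671B) vs RxG Nano (1.5B).
param_ratio = ratio(671, 1.5)

print(f"AIME 2024 ratio: {aime_ratio:.2f}x")
print(f"Parameter ratio: 1/{param_ratio:.0f}")
```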
194
+
195
+ <br>
196
+
197
+ ### Code Generation
198
+
199
+ | Model | HumanEval | MBPP+ | Aider |
200
+ |:------|:---------:|:-----:|:-----:|
201
+ | **YuuKi RxG Nano 1.5B** | **71.4%** | **55.6%** | **55.6%** |
202
+
203
+ <br>
204
+
205
+ ### Frontier Benchmark
206
+
207
+ | Model | HLE |
208
+ |:------|:---:|
209
+ | GPT-4o | ~3–5% |
210
+ | Best public frontier (2026) | ~44.7% |
211
+ | **YuuKi RxG Nano 1.5B** | **8.0%** |
212
+
213
+ 8.0% on Humanity's Last Exam (judged by Claude Sonnet 4.6) is consistent with expected capability at 1.5B scale and represents a meaningful baseline for the RxG Nano generation.
214
+
215
+ <br>
216
+
217
+ ### OpceanAI Family Comparison
218
+
219
+ | Model | Params | MMLU | ARC-C | WinoGrande | TruthfulQA | AIME 2024 |
220
+ |:------|:------:|:----:|:-----:|:----------:|:----------:|:---------:|
221
+ | YuuKi NxG Nano | 81M | 22.97% | 24.32% | 50.12% | 44.1% | — |
222
+ | YuuKi NxG | 3B | 60.65% | 45.31% | 63.14% | 50.87% | — |
223
+ | YuuKi NxG VL | 7B | 70.8% | 85.8% | 70.8% | 63.8% | — |
224
+ | **YuuKi RxG Nano** | **1.5B** | **85.4%** | **80.0%** | **84.4%** | **89.6%** | **80.0%** |
225
+ | YuuKi RxG | 8B | — | — | — | 96.6% | 87.3% |
226
+
227
+ RxG Nano surpasses every prior OpceanAI model on MMLU and WinoGrande despite being smaller than most of them. This result is attributable to the VibeThinker base — a frontier distillation — rather than to the fine-tuning process itself.
228
+
229
+ <br>
230
+
231
+ ---
232
+
233
+ <br>
234
+
235
+ <div align="center">
236
+
237
+ ## Model Identity
238
+
239
+ </div>
240
+
241
+ <br>
242
+
243
+ YuuKi RxG Nano inherits the behavioral foundation of the YuuKi model family: a consistent identity trained into the weights rather than enforced at inference time through system prompts. The fine-tuning process installs the YuuKi character into the model's representational space without degrading the reasoning capability inherited from VibeThinker.
244
+
245
+ The model reasons explicitly before responding. `<think>` blocks are preserved during inference and reflect genuine intermediate computation. This is not a prompted behavior — it is a property of the VibeThinker base that the LoRA fine-tuning did not degrade, consistent with the expectation that LoRA modifies only a small subspace of the total parameter space.
246
+
247
+ The model responds natively in the user's language (English or Spanish) without requiring explicit instruction.
248
+
249
+ ```
250
+ Recommended system prompt:
251
+ "Eres YuuKi, una IA curiosa, empática y decidida desarrollada por OpceanAI.
252
+ Tienes una personalidad cálida y cercana, con toques de humor suave.
253
+ Razonas con cuidado antes de responder y priorizas la precisión factual.
254
+ Respondes en el idioma del usuario."
255
+ ```
256
+
257
+ <br>
258
+
259
+ ---
260
+
261
+ <br>
262
+
263
+ <div align="center">
264
+
265
+ ## Usage
266
+
267
+ </div>
268
+
269
+ <br>
270
+
271
+ ### With Transformers (PyTorch)
272
+
273
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "OpceanAI/Yuuki-RxG-nano"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+ # Recommended system prompt (in Spanish); the model replies in the user's language.
+ SYSTEM = (
+     "Eres YuuKi, una IA curiosa, empática y decidida desarrollada por OpceanAI. "
+     "Tienes una personalidad cálida y cercana, con toques de humor suave. "
+     "Razonas con cuidado antes de responder y priorizas la precisión factual. "
+     "Respondes en el idioma del usuario."
+ )
+
+ messages = [
+     {"role": "system", "content": SYSTEM},
+     {"role": "user", "content": "Solve: find all integer solutions to x² + y² = 2026."},
+ ]
+
+ # Apply the chat template; return_dict=True yields input_ids and attention_mask.
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt",
+     return_dict=True,
+ ).to(model.device)
+
+ with torch.no_grad():
+     outputs = model.generate(
+         **inputs,
+         max_new_tokens=1024,
+         temperature=0.6,
+         top_p=0.9,
+         do_sample=True,
+         eos_token_id=tokenizer.eos_token_id,
+         pad_token_id=tokenizer.eos_token_id,
+         repetition_penalty=1.1,
+     )
+
+ # Decode only the newly generated tokens, skipping the prompt.
+ prompt_len = inputs["input_ids"].shape[1]
+ response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
+ print(response)
+ ```
319
+
320
+ <br>
321
+
322
+ ### With Unsloth (Recommended for fine-tuning)
323
+
324
+ ```python
+ from unsloth import FastLanguageModel
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="OpceanAI/Yuuki-RxG-nano",
+     max_seq_length=4096,
+     load_in_4bit=True,
+     dtype=None,  # None lets Unsloth auto-detect the dtype
+ )
+
+ # Switch the model to Unsloth's inference mode.
+ FastLanguageModel.for_inference(model)
+ ```
336
+
337
+ <br>
338
+
339
+ ### With Ollama
340
+
341
+ ```bash
342
+ ollama run opceanai/yuuki-rxg-nano
343
+ ```
344
+
345
+ <br>
346
+
347
+ ### Recommended Generation Parameters
348
+
349
+ | Parameter | Mathematics | General | Creative |
350
+ |:----------|:-----------:|:-------:|:--------:|
351
+ | Temperature | 0.3–0.5 | 0.6–0.7 | 0.7–0.8 |
352
+ | Top-p | 0.9 | 0.9 | 0.95 |
353
+ | Max new tokens | 1024–2048 | 512–1024 | 256–512 |
354
+ | Repetition penalty | 1.1 | 1.1 | 1.05 |
355
+
356
+ Lower temperature is strongly recommended for competition mathematics and formal reasoning tasks. The model's `<think>` blocks will be visible in output by default — this is expected behavior and reflects genuine intermediate reasoning.
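
Because the `<think>` block is part of the decoded output, applications that only want the final answer need to separate the two. The following is a minimal sketch, assuming the tags appear as literal `<think>` and `</think>` text in the decoded string; the helper name `split_think` is hypothetical and not part of any library.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text):
    """Return (reasoning, answer) from a decoded model response."""
    match = THINK_RE.search(text)
    if match is None:
        # No complete think block (for example, generation was truncated).
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


reasoning, answer = split_think(
    "<think>45^2 + 1^2 = 2026</think>One solution is (45, 1)."
)
print(answer)
```

If generation stops before the closing tag is produced, this sketch returns the whole text as the answer, so a larger `max_new_tokens` budget may be needed for long reasoning traces.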
357
+
358
+ <br>
359
+
360
+ ---
361
+
362
+ <br>
363
+
364
+ <div align="center">
365
+
366
+ ## Training Details
367
+
368
+ </div>
369
+
370
+ <br>
371
+
372
+ <table>
373
+ <tr>
374
+ <td width="50%" valign="top">
375
+
376
+ **Hardware**
377
+
378
+ | Component | Specification |
379
+ |:----------|:-------------|
380
+ | GPU | NVIDIA A100 40GB |
381
+ | Precision | BF16 native |
382
+ | Framework | Unsloth 2026.4 + TRL |
383
+ | Flash Attention | Xformers fallback |
384
+ | Cloud Compute | Google Colab Pro |
385
+ | Training Time | ~90 minutes |
386
+ | Total Cost | < $15 USD |
387
+
388
+ </td>
389
+ <td width="50%" valign="top">
390
+
391
+ **LoRA Configuration**
392
+
393
+ | Parameter | Value |
394
+ |:----------|:-----:|
395
+ | Rank (r) | 16 |
396
+ | Alpha | 32 |
397
+ | Dropout | 0.0 |
398
+ | Target Modules | q, k, v, o, gate, up, down |
399
+ | Trainable Parameters | 18.4M (1.18%) |
400
+ | Gradient Checkpointing | Unsloth smart offload |
401
+ | Quantization | 4-bit NF4 (QLoRA) |
402
+
403
+ </td>
404
+ </tr>
405
+ </table>
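
The trainable-parameter figure in the LoRA table can be approximated from the rank and the shapes of the targeted layers. The sketch below assumes Qwen2.5-1.5B-style dimensions (hidden size 1536, MLP size 8960, 28 layers, key/value projection width 256); these values are assumptions taken from that architecture family rather than from this card and should be checked against the model's `config.json`.

```python
# Assumed Qwen2.5-1.5B-style dimensions; verify against config.json.
HIDDEN = 1536
INTERMEDIATE = 8960
NUM_LAYERS = 28
KV_DIM = 256  # 2 key/value heads x head dim 128 (grouped-query attention)
RANK = 16     # LoRA rank from the table above

# (in_features, out_features) of each targeted linear layer.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    # LoRA adds A (rank x in_features) and B (out_features x rank).
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGETS.values())
total = per_layer * NUM_LAYERS
print(f"{total:,} trainable LoRA parameters")
```

Under these assumptions the count comes to roughly 18.46M, which is in line with the 18.4M reported in the table.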
406
+
407
+ <br>
408
+
409
+ **Optimizer & Training Configuration**
410
+
411
+ | Parameter | Value |
412
+ |:----------|:-----:|
413
+ | Optimizer | AdamW 8-bit |
414
+ | Learning Rate | 2e-4 |
415
+ | LR Scheduler | Cosine |
416
+ | Warmup Steps | 100 |
417
+ | Weight Decay | 0.01 |
418
+ | Per-device Batch Size | 4 |
419
+ | Gradient Accumulation | 8 |
420
+ | Effective Batch Size | 32 |
421
+ | Max Sequence Length | 4,096 tokens |
422
+ | Epochs | 2 |
423
+ | Total Steps | ~1,376 |
424
+
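
The effective batch size and step count in the table are consistent with the 22,000-example dataset described in the Dataset section. The sketch below reproduces the arithmetic, assuming a single GPU, that every example is used for training (no held-out split), and that the final partial batch of each epoch counts as one step.

```python
import math

examples = 22_000        # dataset size from the Dataset section
per_device_batch = 4
grad_accum = 8
epochs = 2

# One optimizer step consumes per_device_batch * grad_accum examples.
effective_batch = per_device_batch * grad_accum
steps_per_epoch = math.ceil(examples / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch, steps_per_epoch, total_steps)
```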
425
+ <br>
426
+
427
+ ### Dataset
428
+
429
+ Training used **OpceanAI/Yuuki-Personality-v2**, a 22,000-example bilingual dataset in ChatML format with native `<think>` reasoning blocks. The dataset was constructed through a multi-source distillation process:
430
+
431
+ - **Kimi K2** — base dataset generation at scale
432
+ - **Gemini** — think block generation and reasoning structure
433
+ - **Claude Opus** — think block refinement and quality improvement
434
+
435
+ The dataset covers conversational reasoning, factual Q&A, mathematical problem-solving, code assistance, identity anchoring, and adversarial resistance across English and Spanish.
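
For readers unfamiliar with the format, the sketch below renders a conversation as ChatML text with a `<think>` block in the assistant turn. The function name `to_chatml` and the example messages are hypothetical; the actual schema and contents of the training dataset are not shown in this card.

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as ChatML text."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages
    ]
    return "\n".join(parts) + "\n"


# Hypothetical example, not taken from the dataset.
example = [
    {"role": "user", "content": "What is 12 * 12?"},
    {
        "role": "assistant",
        "content": "<think>12 * 12 = 144.</think>\nThe answer is 144.",
    },
]

print(to_chatml(example))
```

In practice, `tokenizer.apply_chat_template` produces this layout from the model's bundled template, so a manual renderer like this is mainly useful for inspecting data.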
436
+
437
+ The RxG Nano fine-tuning objective was identity installation — establishing the YuuKi character over the VibeThinker base without degrading the base model's reasoning capability. This was verified post-training by comparing AIME 2024 scores before and after fine-tuning.
438
+
439
+ <br>
440
+
441
+ ### Training Rationale
442
+
443
+ The choice of VibeThinker-1.5B as base model over alternatives (DeepSeek-R1-Distill-1.5B, Qwen3.5-2B) was informed by benchmark comparison:
444
+
445
+ | Model | AIME 2024 | MMLU-Pro | Notes |
446
+ |:------|:---------:|:--------:|:------|
447
+ | DeepSeek-R1-Distill-1.5B | 28.9% | — | SFT only, no RL stage |
448
+ | Qwen3.5-2B | — | 55.3% | Thinking disabled by default at small scale |
449
+ | **VibeThinker-1.5B** | **~80%** | **~65%** | SFT + RL distillation from frontier models |
450
+
451
+ VibeThinker applies both SFT and RL distillation from multiple frontier teachers — the same principle as DeepSeek-R1 distillation, but with a broader and more diverse teacher set. This produces a significantly stronger reasoning foundation at 1.5B scale.
452
+
453
+ <br>
454
+
455
+ ---
456
+
457
+ <br>
458
+
459
+ <div align="center">
460
+
461
+ ## Limitations
462
+
463
+ </div>
464
+
465
+ <br>
466
+
467
+ - **Context length.** Fine-tuning was conducted at 4,096 tokens. The base model supports longer contexts, but performance on tasks requiring context beyond 4,096 tokens has not been formally evaluated.
468
+ - **GPQA Diamond gap.** RxG Nano scores 50.9% on GPQA Diamond, below frontier models (Gemini-2.5-Flash at 82.8%, o3-mini at 76.8%). This benchmark requires graduate-level physics, chemistry, and biology knowledge that is underrepresented in the Yuuki training dataset.
469
+ - **OlympiadBench ceiling.** 44.6% reflects the upper bound of competition mathematics capability at 1.5B scale with current training methodology. This is a target for improvement in RxG NxG.
470
+ - **Think block quality.** Some `<think>` blocks inherit boilerplate patterns from the training dataset. Reasoning quality is variable — stronger for mathematics and logic, weaker for open-ended knowledge retrieval.
471
+ - **Safety alignment.** Safety alignment has not been formally evaluated under adversarial conditions. The model is not recommended for safety-critical deployment without additional review.
472
+ - **HLE at 8.0%.** Humanity's Last Exam performance reflects genuine capability limits at this scale. The score was evaluated using Claude Sonnet 4.6 as judge, which may introduce evaluation variance.
473
+
474
+ <br>
475
+
476
+ ---
477
+
478
+ <br>
479
+
480
+ <div align="center">
481
+
482
+ ## The RxG Family
483
+
484
+ </div>
485
+
486
+ <br>
487
+
488
+ RxG is the reasoning-specialized lineage within the OpceanAI ecosystem. Each release targets a specific parameter regime and deployment context.
489
+
490
+ | Model | Parameters | Status | Base | Primary Target |
491
+ |:------|:----------:|:------:|:----:|:---------------|
492
+ | **YuuKi RxG Nano** | **1.5B** | **Released** | **VibeThinker-1.5B** | **Edge deployment, reasoning baseline** |
493
+ | YuuKi RxG 8B | 8B | Released | DeepSeek-R1-Distill-Qwen-8B | General reasoning, competition math |
494
+ | YuuKi RxG VL 27B | 27B | Planned | TBD | Multimodal reasoning, flagship |
495
+
496
+ <br>
497
+
498
+ ---
499
+
500
+ <br>
501
+
502
+ <div align="center">
503
+
504
+ ## OpceanAI Ecosystem
505
+
506
+ </div>
507
+
508
+ <br>
509
+
510
+ | Model | Family | Parameters | Description |
511
+ |:------|:------:|:----------:|:------------|
512
+ | [YuuKi RxG Nano](https://huggingface.co/OpceanAI/Yuuki-RxG-nano) | RxG | 1.5B | Edge reasoning, AIME 80.0%, TruthfulQA 89.6% |
513
+ | [YuuKi RxG 8B](https://huggingface.co/OpceanAI/Yuuki-RxG) | RxG | 8B | Reasoning flagship, TruthfulQA 96.6% |
514
+ | [Yumo Nano](https://huggingface.co/OpceanAI/yumo-nano) | Yumo | 1.5B | Math specialist, surpasses DeepScaleR |
515
+ | [YuuKi NxG VL](https://huggingface.co/OpceanAI/Yuuki-NxG-VL) | NxG | 7B | General conversation + vision |
516
+
517
+ <br>
518
+
519
+ ---
520
+
521
+ <br>
522
+
523
+ <div align="center">
524
+
525
+ ## Links
526
+
527
+ </div>
528
+
529
+ <br>
530
+
531
+ <div align="center">
532
+
533
+ [![Model Weights](https://img.shields.io/badge/Model_Weights-Hugging_Face-ffd21e?style=for-the-badge&logo=huggingface&logoColor=black)](https://huggingface.co/OpceanAI/Yuuki-RxG-nano)
534
+ &nbsp;
535
+ [![OpceanAI](https://img.shields.io/badge/OpceanAI-Organization-1a1a2e?style=for-the-badge&logo=huggingface&logoColor=white)](https://huggingface.co/OpceanAI)
536
+ &nbsp;
537
+ [![RxG 8B](https://img.shields.io/badge/YuuKi_RxG_8B-Flagship-6d28d9?style=for-the-badge&logo=huggingface&logoColor=white)](https://huggingface.co/OpceanAI/Yuuki-RxG)
538
+
539
+ <br>
540
+
541
+ [![GitHub](https://img.shields.io/badge/GitHub-aguitauwu-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/aguitauwu)
542
+ &nbsp;
543
+ [![Sponsor](https://img.shields.io/badge/Sponsor-GitHub_Sponsors-ea4aaa?style=for-the-badge&logo=githubsponsors&logoColor=white)](https://github.com/sponsors/aguitauwu)
544
+ &nbsp;
545
+ [![Discord](https://img.shields.io/badge/Discord-Community-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/j8zV2u8k)
546
+
547
+ </div>
548
+
549
+ <br>
550
+
551
+ ---
552
+
553
+ <br>
554
+
555
+ <div align="center">
556
+
557
+ ## Citation
558
+
559
+ </div>
560
+
561
+ <br>
562
+
563
+ ```bibtex
564
+ @misc{awa_omg_2026_rxg_nano,
565
+ author = { awa_omg },
566
+ title = { Yuuki-RxG-nano (Revision 1.0) },
567
+ year = 2026,
568
+ url = { https://huggingface.co/OpceanAI/Yuuki-RxG-nano },
569
+ publisher = { Hugging Face }
570
+ }
571
+ ```
572
+
573
+ <br>
574
+
575
+ ---
576
+
577
+ <br>
578
+
579
+ <div align="center">
580
+
581
+ ## License
582
+
583
+ </div>
584
+
585
+ <br>
586
+
587
+ ```
588
+ Apache License 2.0
589
+
590
+ Copyright (c) 2026 OpceanAI
591
+
592
+ Licensed under the Apache License, Version 2.0 (the "License");
593
+ you may not use this file except in compliance with the License.
594
+ You may obtain a copy of the License at
595
+
596
+ http://www.apache.org/licenses/LICENSE-2.0
597
+
598
+ Unless required by applicable law or agreed to in writing, software
599
+ distributed under the License is distributed on an "AS IS" BASIS,
600
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
601
+ See the License for the specific language governing permissions and
602
+ limitations under the License.
603
+ ```
604
+
605
+ This model inherits the license terms of [VibeThinker-1.5B](https://huggingface.co/WeiboAI/VibeThinker-1.5B) and [Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B).
606
+
607
+ <br>
608
+
609
+ ---
610
+
611
+ <br>
612
+
613
+ <div align="center">
614
+
615
+ ## Updates
616
+
617
+ </div>
618
+
619
+ <br>
620
+
621
+ | Date | Milestone |
622
+ |:-----|:----------|
623
+ | **2026-04-27** | MMLU-Pro 65.63% — exceeds DeepSeek V3 671B |
624
+ | **2026-04-27** | AIME 2024 80.0% — 2.77× DeepSeek-R1-Distill-1.5B |
625
+ | **2026-04-27** | TruthfulQA MC1 89.6% (1-shot) verified |
626
+ | **2026-04-27** | HLE 8.0% evaluated with Claude Sonnet 4.6 judge |
627
+ | **2026-04-27** | YuuKi RxG Nano v1.0 released on Hugging Face |
628
+
629
+ **Last updated:** 2026-04-27
630
+
631
+ <br>
632
+
633
+ ---
634
+
635
+ <br>
636
+
637
+ <div align="center">
638
+
639
+ **1.5B parameters. 90 minutes of training. Under $15 of compute.**<br>
640
+ **AIME 2024 at 80.0%. MMLU-Pro exceeding a 671B model.**<br>
641
+ **This is what frontier distillation makes possible at the edge.**
642
+
643
+ <br>
644
+
645
+ [![OpceanAI](https://img.shields.io/badge/OpceanAI-2026-0D1117?style=for-the-badge)](https://huggingface.co/OpceanAI)
646
+
647
+ <br>
648
+
649
+ *The RxG family. Built under constraints. No excuses.*
650
+
651
+ </div>