Mingke977 committed on
Commit a272e55 · verified · 1 Parent(s): 13fdcbe

Add files using upload-large-folder tool

Files changed (1): README.md (+413 −45)
README.md CHANGED
@@ -1,49 +1,417 @@
  ---
- base_model: []
  library_name: transformers
- tags:
- - mergekit
- - merge
-
  ---
- # c362_step50_ta05
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [Linear DARE](https://arxiv.org/abs/2311.03099) merge method, with /root/myCodeLab/host/downloads/models/40Bra as the base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * /root/myCodeLab/host/verl/ckpts/40bra_k8s_single_domain/40bra_k8s_16node_sd_c362_20260327_205644_unknown/global_step_50/actor/huggingface
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- base_model: /root/myCodeLab/host/downloads/models/40Bra
- dtype: float32
- merge_method: dare_linear
- modules:
-   default:
-     slices:
-     - sources:
-       - layer_range: [0, 40]
-         model: /root/myCodeLab/host/downloads/models/40Bra
-       - layer_range: [0, 40]
-         model: /root/myCodeLab/host/verl/ckpts/40bra_k8s_single_domain/40bra_k8s_16node_sd_c362_20260327_205644_unknown/global_step_50/actor/huggingface
-         parameters:
-           density: 1.0
-           weight:
-           - filter: .mlp.gate.
-             value: 0.0
-           - value: 0.5
-     - sources:
-       - layer_range: [40, 41]
-         model: /root/myCodeLab/host/downloads/models/40Bra
- out_dtype: bfloat16
  ```
  ---
+ language:
+ - zh
+ - en
+ pipeline_tag: text-generation
  library_name: transformers
  ---
+ <div align="center">
+ <picture>
+ <img src="figures/joyai-logo.png" width="30%" alt="JoyAI-LLM Flash">
+ </picture>
+ </div>
+ <hr>
+
+ <div align="center" style="line-height: 1;">
+ <a href="https://huggingface.co/jdopensource" target="_blank"><img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-JD-ffc107?color=ffc107&logoColor=white"/></a>
+ <a href="https://huggingface.co/jdopensource/JoyAI-LLM-Flash/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Modified_MIT-f5de53?&color=f5de53"/></a>
+ </div>
+
+ <p align="center">
+ <b>📰&nbsp;&nbsp;<a href="https://huggingface.co/jdopensource/JoyAI-LLM-Flash/blob/main/JoyAI_Flash_techreport.pdf">Tech Report</a></b>
+ </p>
+
+ ## 1. Model Introduction
+
+ JoyAI-LLM Flash is a state-of-the-art medium-sized instruct language model with 3 billion activated parameters and 48 billion total parameters. It was pretrained on 20 trillion text tokens using the Muon optimizer, followed by large-scale supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL) across diverse environments. JoyAI-LLM Flash achieves strong performance on frontier knowledge, reasoning, and coding tasks, as well as agentic capabilities.
+
+ ### Key Features
+
+ - Fibration Policy Optimization: introduces fiber bundle theory into reinforcement learning, proposing a novel optimization framework, FiberPO. The method is designed for the challenges of large-scale, heterogeneous agent training, improving stability and robustness under complex data distributions. [Paper link](https://arxiv.org/abs/2603.08239)
+ - Training-Inference Collaboration: applies the Muon optimizer with dense MTP and introduces novel optimization techniques that resolve instabilities at scale, delivering 1.3× to 1.7× the throughput of the non-MTP version.
+ - Agentic Intelligence: designed for tool use, reasoning, and autonomous problem-solving.
+
+ ## 2. Model Summary
+
+ | | |
+ | :-----------------------------------------: | :----------------------: |
+ | **Architecture** | Mixture-of-Experts (MoE) |
+ | **Total Parameters** | 48B |
+ | **Activated Parameters** | 3B |
+ | **Number of Layers** (Dense layer included) | 40 |
+ | **Number of Dense Layers** | 1 |
+ | **Attention Hidden Dimension** | 2048 |
+ | **MoE Hidden Dimension** (per Expert) | 768 |
+ | **Number of Attention Heads** | 32 |
+ | **Number of Experts** | 256 |
+ | **Selected Experts per Token** | 8 |
+ | **Number of Shared Experts** | 1 |
+ | **Vocabulary Size** | 129K |
+ | **Context Length** | 128K |
+ | **Attention Mechanism** | MLA |
+ | **Activation Function** | SwiGLU |
+
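As a rough sanity check (our own back-of-the-envelope estimate, not an official parameter breakdown), the expert configuration in the table above is consistent with the 48B-total / 3B-activated figures. Assuming 39 of the 40 layers are MoE layers and each SwiGLU expert has gate, up, and down projections of size hidden × MoE-hidden:

```python
# Back-of-the-envelope MoE parameter estimate from the summary table.
# Assumptions (ours, not official): 39 MoE layers, SwiGLU experts with
# three projections each; attention, embeddings, and the dense layer are ignored.
HIDDEN = 2048
MOE_HIDDEN = 768
MOE_LAYERS = 40 - 1          # one dense layer out of 40
EXPERTS = 256
ACTIVE = 8 + 1               # 8 selected experts + 1 shared expert

per_expert = 3 * HIDDEN * MOE_HIDDEN                 # gate + up + down projections
total_expert_params = per_expert * EXPERTS * MOE_LAYERS
active_expert_params = per_expert * ACTIVE * MOE_LAYERS

print(f"per expert:        {per_expert / 1e6:.1f}M")          # ~4.7M
print(f"all experts:       {total_expert_params / 1e9:.1f}B")  # ~47B of the 48B total
print(f"activated experts: {active_expert_params / 1e9:.2f}B") # bulk of the 3B activated
```

The expert weights alone account for roughly 47B parameters, so the remaining ~1B (attention, embeddings, dense layer) closes the gap to the quoted totals.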
+ ## 3. Evaluation Results
+
+ <table>
+ <thead>
+ <tr>
+ <th align="center">Benchmark</th>
+ <th align="center"><sup>JoyAI-LLM Flash</sup></th>
+ <th align="center"><sup>Qwen3-30B-A3B-Instruct-2507</sup></th>
+ <th align="center"><sup>GLM-4.7-Flash<br>(Non-thinking)</sup></th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr><td align="center" colspan=4><strong>Knowledge &amp; Alignment</strong></td></tr>
+ <tr><td align="center">MMLU</td><td align="center"><strong>89.50</strong></td><td align="center">86.87</td><td align="center">80.53</td></tr>
+ <tr><td align="center">MMLU-Pro</td><td align="center"><strong>81.02</strong></td><td align="center">73.88</td><td align="center">63.62</td></tr>
+ <tr><td align="center">CMMLU</td><td align="center"><strong>87.03</strong></td><td align="center">85.88</td><td align="center">75.85</td></tr>
+ <tr><td align="center">GPQA-Diamond</td><td align="center"><strong>74.43</strong></td><td align="center">68.69</td><td align="center">39.90</td></tr>
+ <tr><td align="center">SuperGPQA</td><td align="center"><strong>55.00</strong></td><td align="center">52.00</td><td align="center">32.00</td></tr>
+ <tr><td align="center">LiveBench</td><td align="center"><strong>72.90</strong></td><td align="center">59.70</td><td align="center">43.10</td></tr>
+ <tr><td align="center">IFEval</td><td align="center"><strong>86.69</strong></td><td align="center">83.18</td><td align="center">82.44</td></tr>
+ <tr><td align="center">AlignBench</td><td align="center"><strong>8.24</strong></td><td align="center">8.07</td><td align="center">6.85</td></tr>
+ <tr><td align="center">HellaSwag</td><td align="center"><strong>91.79</strong></td><td align="center">89.90</td><td align="center">60.84</td></tr>
+ <tr><td align="center" colspan=4><strong>Coding</strong></td></tr>
+ <tr><td align="center">HumanEval</td><td align="center"><strong>96.34</strong></td><td align="center">95.12</td><td align="center">74.39</td></tr>
+ <tr><td align="center">LiveCodeBench</td><td align="center"><strong>65.60</strong></td><td align="center">39.71</td><td align="center">27.43</td></tr>
+ <tr><td align="center">SciCode</td><td align="center"><strong>3.08/22.92</strong></td><td align="center"><strong>3.08/22.92</strong></td><td align="center">3.08/15.11</td></tr>
+ <tr><td align="center" colspan=4><strong>Mathematics</strong></td></tr>
+ <tr><td align="center">GSM8K</td><td align="center"><strong>95.83</strong></td><td align="center">79.83</td><td align="center">81.88</td></tr>
+ <tr><td align="center">AIME2025</td><td align="center"><strong>65.83</strong></td><td align="center">62.08</td><td align="center">24.17</td></tr>
+ <tr><td align="center">MATH 500</td><td align="center"><strong>97.10</strong></td><td align="center">89.80</td><td align="center">90.90</td></tr>
+ <tr><td align="center" colspan=4><strong>Agentic</strong></td></tr>
+ <tr><td align="center">SWE-bench Verified</td><td align="center"><strong>60.60</strong></td><td align="center">24.44</td><td align="center">51.60</td></tr>
+ <tr><td align="center">Tau2-Retail</td><td align="center"><strong>67.55</strong></td><td align="center">53.51</td><td align="center">62.28</td></tr>
+ <tr><td align="center">Tau2-Airline</td><td align="center"><strong>54.00</strong></td><td align="center">32.00</td><td align="center">52.00</td></tr>
+ <tr><td align="center">Tau2-Telecom</td><td align="center">79.83</td><td align="center">4.39</td><td align="center"><strong>88.60</strong></td></tr>
+ <tr><td align="center" colspan=4><strong>Long Context</strong></td></tr>
+ <tr><td align="center">RULER</td><td align="center"><strong>95.60</strong></td><td align="center">89.66</td><td align="center">56.12</td></tr>
+ </tbody>
+ </table>
+
+ ## 4. Deployment
+
+ > [!NOTE]
+ > You can access the JoyAI-LLM Flash API at https://docs.jdcloud.com/cn/jdaip/chat; we provide an OpenAI/Anthropic-compatible API.
+
+ Currently, JoyAI-LLM Flash is recommended to run on the following inference engines:
+
+ * vLLM
+ * SGLang
+
+ The minimum version requirement for `transformers` is `4.57.1`.
+
+ Deployment examples can be found in the [Model Deployment Guide](docs/deploy_guidance.md).
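As a hedged sketch of the vLLM path (the model ID, parallelism degree, and flags are illustrative assumptions; consult the Model Deployment Guide and your vLLM version's documentation for the exact invocation), an OpenAI-compatible server could be launched along these lines:

```shell
# Illustrative only: install recent vLLM plus the minimum transformers version,
# then serve the checkpoint behind an OpenAI-compatible endpoint.
pip install "vllm" "transformers>=4.57.1"
vllm serve jdopensource/JoyAI-LLM-Flash \
    --tensor-parallel-size 4 \
    --max-model-len 131072
```

The served endpoint (`http://HOST:8000/v1` by default) is what the `base_url` in the usage scripts below would point at.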
+
+ ## 5. Model Usage
+
+ The demos below show how to call our official API.
+
+ For third-party APIs deployed with vLLM or SGLang, please note:
+
+ > [!NOTE]
+ > Recommended sampling parameters: `temperature=0.6`, `top_p=1.0`
+
+ ### Chat Completion
+
+ This is a simple chat completion script which shows how to call the JoyAI-Flash API.
+
+ ```python
+ from openai import OpenAI
+
+ client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")
+
+
+ def simple_chat(client: OpenAI):
+     messages = [
+         {
+             "role": "user",
+             "content": [
+                 {
+                     "type": "text",
+                     "text": "which one is bigger, 9.11 or 9.9? think carefully.",
+                 }
+             ],
+         },
+     ]
+     model_name = client.models.list().data[0].id
+     response = client.chat.completions.create(
+         model=model_name,
+         messages=messages,
+         stream=False,
+         temperature=0.6,  # recommended sampling parameters (see note above)
+         top_p=1.0,
+         max_tokens=4096,
+     )
+     print(f"response: {response.choices[0].message.content}")
+
+
+ if __name__ == "__main__":
+     simple_chat(client)
+ ```
+
+ ### Tool Call Completion
+
+ This is a simple tool call completion script which shows how to call the JoyAI-Flash API.
+
+ ```python
+ import json
+
+ from openai import OpenAI
+
+ client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")
+
+
+ def my_calculator(expression: str) -> str:
+     # Demo only: eval() on untrusted input is unsafe in production.
+     return str(eval(expression))
+
+
+ def rewrite(text: str) -> str:
+     return str(text)
+
+
+ def simple_tool_call(client: OpenAI):
+     messages = [
+         {
+             "role": "user",
+             "content": [
+                 {
+                     "type": "text",
+                     "text": "use my functions to compute the results for the equations: 6+1",
+                 },
+             ],
+         },
+     ]
+     tools = [
+         {
+             "type": "function",
+             "function": {
+                 "name": "my_calculator",
+                 "description": "A calculator that can evaluate a mathematical equation and compute its results.",
+                 "parameters": {
+                     "type": "object",
+                     "properties": {
+                         "expression": {
+                             "type": "string",
+                             "description": "The mathematical expression to evaluate.",
+                         },
+                     },
+                     "required": ["expression"],
+                 },
+             },
+         },
+         {
+             "type": "function",
+             "function": {
+                 "name": "rewrite",
+                 "description": "Rewrite a given text for improved clarity",
+                 "parameters": {
+                     "type": "object",
+                     "properties": {
+                         "text": {
+                             "type": "string",
+                             "description": "The input text to rewrite",
+                         }
+                     },
+                     "required": ["text"],
+                 },
+             },
+         },
+     ]
+     model_name = client.models.list().data[0].id
+     response = client.chat.completions.create(
+         model=model_name,
+         messages=messages,
+         temperature=1.0,
+         max_tokens=1024,
+         tools=tools,
+         tool_choice="auto",
+     )
+     tool_calls = response.choices[0].message.tool_calls
+
+     # Execute each requested tool and feed the results back to the model.
+     results = []
+     for tool_call in tool_calls:
+         function_name = tool_call.function.name
+         function_args = json.loads(tool_call.function.arguments)
+         if function_name == "my_calculator":
+             results.append(my_calculator(**function_args))
+         elif function_name == "rewrite":
+             results.append(rewrite(**function_args))
+     messages.append({"role": "assistant", "tool_calls": tool_calls})
+     for tool_call, result in zip(tool_calls, results):
+         messages.append(
+             {
+                 "role": "tool",
+                 "tool_call_id": tool_call.id,
+                 "name": tool_call.function.name,
+                 "content": result,
+             }
+         )
+     response = client.chat.completions.create(
+         model=model_name,
+         messages=messages,
+         temperature=1.0,
+         max_tokens=1024,
+     )
+     print(response.choices[0].message.content)
+
+
+ if __name__ == "__main__":
+     simple_tool_call(client)
  ```
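The calculator tool above relies on Python's `eval`, which will execute arbitrary code from the model's arguments. As a hedged alternative (the names here are illustrative, not part of the JoyAI API), a restricted evaluator can walk the expression's AST and accept only arithmetic:

```python
import ast
import operator

# Whitelist of arithmetic operators the evaluator will accept.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def safe_calculator(expression: str) -> str:
    """Evaluate basic arithmetic without eval(); reject anything else."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    return str(_eval(ast.parse(expression, mode="eval").body))
```

Dropping `safe_calculator` in for `my_calculator` keeps the tool-call loop unchanged while refusing function calls, attribute access, and anything else outside plain arithmetic.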
+
+ ---
+
+ ## 6. License
+
+ Both the code repository and the model weights are released under the [Modified MIT License](LICENSE).