spicyneuron committed
Commit 4a65b34 · Parent: f16a072

Add files using upload-large-folder tool

.DS_Store ADDED
Binary file (6.15 kB).
 
README.md CHANGED
@@ -8,47 +8,3 @@ tags:
  - mlx
  base_model: MiniMaxAI/MiniMax-M2.7
  ---
-
- [MiniMax-M2.7](MiniMaxAI/MiniMax-M2.7) optimized for MLX. A mixed-precision quant that balances speed, memory, and accuracy.
-
- # Usage
-
- ```sh
- # Start server at http://localhost:8080/chat/completions
- uvx --from mlx-lm mlx_lm.server \
-   --host 127.0.0.1 \
-   --port 8080 \
-   --model spicyneuron/MiniMax-M2.7-MLX-4.6bit
- ```
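Once the server is up, any OpenAI-style chat completions client can talk to it. A minimal sketch with curl, using the endpoint path from the comment above; the payload shape is the standard chat completions format and is an assumption about the mlx-lm server version in use:

```sh
# Hedged example request: endpoint path taken from the comment above;
# the JSON body follows the usual OpenAI-compatible chat completions shape.
curl -s http://localhost:8080/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "spicyneuron/MiniMax-M2.7-MLX-4.6bit",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 128
  }'
```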
-
- # Methodology
-
- Quantized with an [mlx-lm fork](https://github.com/ml-explore/mlx-lm/pull/922), drawing inspiration from Unsloth/AesSedai/ubergarm-style mixed-precision GGUFs.
- MLX quantization options differ from llama.cpp's, but the principles are the same:
-
- - Sensitive layers like MoE routing, attention, and output embeddings get higher precision
- - More tolerant layers like MoE experts get lower precision
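As a rough illustration of the idea (not the exact interface of the linked fork), upstream mlx-lm's converter accepts a quantization predicate that can assign different bit widths per layer. The flag values and recipe name below are assumptions and are not the settings actually used for this model:

```sh
# Hypothetical sketch only: the linked fork (PR #922) controls per-layer precision;
# the --quant-predicate recipe name below is an assumption about upstream
# mlx_lm.convert and does not reproduce this model's exact mix.
uvx --from mlx-lm mlx_lm.convert \
  --hf-path MiniMaxAI/MiniMax-M2.7 \
  --mlx-path ./MiniMax-M2.7-MLX-mixed \
  -q --q-bits 4 --q-group-size 64 \
  --quant-predicate mixed_4_6   # higher bits for sensitive layers, lower for experts
```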
-
- # Benchmarks
-
- metric | mlx-community_MiniMax-M2.7-4bit | baa-ai_MiniMax-M2.7-RAM-155GB-MLX | 4.6 bit (this model)
- --- | --- | --- | ---
- bpw | 4.501 | 5.4278 | 4.5987
- peak memory, GB (1024/512) | 129.632 | 156.051 | 132.442
- prompt tok/s (1024) | 739.996 ± 1.565 | 708.147 ± 0.818 | 740.409 ± 0.268
- gen tok/s (512) | 48.703 ± 0.116 | 40.253 ± 0.077 | 48.038 ± 0.099
- perplexity | 9.120 ± 0.047 | 8.835 ± 0.045 | 4.462 ± 0.019
- hellaswag | 0.504 ± 0.011 | 0.509 ± 0.011 | 0.505 ± 0.011
- piqa | 0.786 ± 0.01 | 0.787 ± 0.01 | 0.793 ± 0.009
- winogrande | 0.636 ± 0.014 | 0.661 ± 0.013 | 0.645 ± 0.013
-
- Tested on a Mac Studio M3 Ultra with:
-
- ```
- mlx_lm.perplexity --sequence-length 2048 --seed 123
- mlx_lm.benchmark --prompt-tokens 1024 --generation-tokens 512 --num-trials 5
- mlx_lm.evaluate --tasks hellaswag --seed 123 --num-shots 0 --limit 2000
- mlx_lm.evaluate --tasks piqa --seed 123 --num-shots 0 --limit 2000
- mlx_lm.evaluate --tasks winogrande --seed 123 --num-shots 0 --limit 2000
- ```
-
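The benchmark commands above are shown without a model argument; a full invocation of the first one might look like this (pointing --model at this repo is an assumption for illustration):

```sh
# Hedged example of one complete run; --model is assumed to accept the Hub repo id.
uvx --from mlx-lm mlx_lm.perplexity \
  --model spicyneuron/MiniMax-M2.7-MLX-4.6bit \
  --sequence-length 2048 \
  --seed 123
```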
 
chat_template.jinja CHANGED
@@ -32,7 +32,7 @@
  {%- endif -%}
  {{- model_identity }}
  {%- endif -%}
-
+
  {#- Handle current_date -#}
  {%- if system_message and system_message.current_date -%}
  {{- '\n' ~ 'Current date: ' + system_message.current_date }}
@@ -116,14 +116,14 @@
  {% endfor %}
  {{- '</invoke>' ~ '\n' }}
  {%- endfor -%}
-
+
  {{- toolcall_end_token}}
  {%- set last_tool_call.name = message.tool_calls[-1].name -%}
  {%- else -%}
  {%- set last_tool_call.name = none -%}
  {%- endif -%}
  {{- '[e~[' ~ '\n' }}
-
+
  {%- elif message.role == 'tool' -%}
  {%- if last_tool_call.name is none -%}
  {{- raise_exception("Message has tool role, but there was no previous assistant message with a tool call!") }}
@@ -145,7 +145,7 @@
  {%- if loop.last or (conversation_messages[loop.index0 + 1].role != 'tool') -%}
  {{- '[e~[\n' -}}
  {%- endif -%}
-
+
  {%- elif message.role == 'user' -%}
  {{- ']~b]user' ~ '\n' }}
  {{- visible_text(message.content) }}
config.json CHANGED
The diff for this file is too large to render.
 
model-00001-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1453e1d2176aba8f7d7f0c8ec41d36b190260cc5e09bf882b2dee461022747fe
+ size 5177037304
model-00002-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0572ba9610d5c45fb51db9d8592308fb18be7ab7150ed511bbe7f197e17f64ce
+ size 5155124140
model-00003-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1fbe7d62fbf61cd4bcf7948b8015bb360e22df0c026908c1c089a60569081c8a
+ size 5155124138
model-00004-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3835a700f3a2cbeb744bfc7f91890b5ccf2ed7676d7e5d03d5ab72bea4458819
+ size 5354514350
model-00005-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2cbf2f0c5b55f15e761ee0d438623bb1fce4cff9bde11b99318e25a7735b49b8
+ size 5155124199
model-00006-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dd9cbdf4378822059bf3e4604a437a2121fa2157d2416fa2e10e35f1cda9c8af
+ size 5155124193
model-00007-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:acb259ba550c178ec5a85743eadb2b0b58eecc869eb1fba35e2dbd4b25618b11
+ size 5354514427
model-00008-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:11d784a38f27356319cbfe85ff52f436bd5b3589735d36552e801fd6d99921e9
+ size 5155124193
model-00009-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5fae71b190a4c7dc0cb412bdf6196d24fe322f19e78e989ec050674cf4c16a7f
+ size 5155124193
model-00010-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:31c4d0e903bbe9e982d5992b8bf991a359dd2b1a552e97ee0a6ebc0a7161ab16
+ size 5354514409
model-00011-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e4f7f63241f0af1d07ad61f6298debb9325bbec2775a324b4b27abff385506e3
+ size 5155124223
model-00012-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0c34685375b3c296dd32df39a4e258952de296f76a859713a808800e6f2bf595
+ size 5155124171
model-00013-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e3d490a5a1136dff06ba6d2ac3c2b7ae8eaac1e43a042f4a5a51d8eb829db80
+ size 5354514431
model-00014-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a64552bb6eb6f5ccf456fbe6976caac8dbd0322c5fde6b5d985954134ae69c16
+ size 5155124203
model-00015-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bc5ba31651da410e39d6b9f431a8ba6a7638749ad40acc3bc6690c4badd74a6f
+ size 5155124215
model-00016-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3f1b942b84d35e5b8dff79d16db0f42d4c2775564e1e378f33b7e3f5b6bb375c
+ size 5354514393
model-00017-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:648e440326f40ec50b1d0d9f5087ee2972888429c9b06c1e9dfb1f295468bed4
+ size 5155124211
model-00018-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3b89bf73e3bab098ba58694a01aa3014b742ba78d00acc42e22346e07dc01c86
+ size 5155124175
model-00019-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ba8b22b79b1a719c33d6c63b6de8e348b4c2579421c9d3f3fccc30f349cec194
+ size 5354514449
model-00020-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f29c27c8a574f102d6d1ca6cda1958bf4989724be734e1dd8f516d5ecbe6f516
+ size 5155124219
model-00021-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3e79f793f81df310666d32d5767ffdb0e3027e03cb0650f0f40bceaba36028a8
+ size 5155124199
model-00022-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:beb0be8dacf73419602d54995d28ef911c60436c0ba46a0416aa0a660e8954ff
+ size 5354514415
model-00023-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:34a1eadb3a850bc07ccb99c50b55aae61906c1a7c376bbe1ef9cecf921e11c48
+ size 5155124205
model-00024-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3236bdd8c77646e820a6fc0e063ad6eb2773e620112caa46aa8744bb7c85dd1f
+ size 5155124215
model-00025-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8c96268c2039d21d786afcb7c82fbca99355fa1112863cc6778bc0a0e51553b6
+ size 5354514437
model-00026-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:45730088b256deabd29422cf786faa69b94f529080c3671048552dfc2a41ea52
+ size 5306119123
model-00027-of-00027.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:92f46277657390e18f42fca767fa4ede4dd871ab72c779db18abb333b9f37774
+ size 4702792504
model.safetensors.index.json CHANGED
@@ -1,6 +1,6 @@
  {
  "metadata": {
- "total_size": 131460857856,
+ "total_size": 140503842816,
  "total_parameters": 228689748992
  },
  "weight_map": {
@@ -1036,10 +1036,10 @@
  "model.layers.42.self_attn.v_proj.biases": "model-00019-of-00027.safetensors",
  "model.layers.42.self_attn.v_proj.scales": "model-00019-of-00027.safetensors",
  "model.layers.42.self_attn.v_proj.weight": "model-00019-of-00027.safetensors",
- "model.layers.43.block_sparse_moe.e_score_correction_bias": "model-00020-of-00027.safetensors",
+ "model.layers.43.block_sparse_moe.e_score_correction_bias": "model-00019-of-00027.safetensors",
  "model.layers.43.block_sparse_moe.gate.weight": "model-00019-of-00027.safetensors",
- "model.layers.43.block_sparse_moe.switch_mlp.down_proj.biases": "model-00020-of-00027.safetensors",
- "model.layers.43.block_sparse_moe.switch_mlp.down_proj.scales": "model-00020-of-00027.safetensors",
+ "model.layers.43.block_sparse_moe.switch_mlp.down_proj.biases": "model-00019-of-00027.safetensors",
+ "model.layers.43.block_sparse_moe.switch_mlp.down_proj.scales": "model-00019-of-00027.safetensors",
  "model.layers.43.block_sparse_moe.switch_mlp.down_proj.weight": "model-00019-of-00027.safetensors",
  "model.layers.43.block_sparse_moe.switch_mlp.gate_proj.biases": "model-00019-of-00027.safetensors",
  "model.layers.43.block_sparse_moe.switch_mlp.gate_proj.scales": "model-00019-of-00027.safetensors",
@@ -1047,8 +1047,8 @@
  "model.layers.43.block_sparse_moe.switch_mlp.up_proj.biases": "model-00019-of-00027.safetensors",
  "model.layers.43.block_sparse_moe.switch_mlp.up_proj.scales": "model-00019-of-00027.safetensors",
  "model.layers.43.block_sparse_moe.switch_mlp.up_proj.weight": "model-00019-of-00027.safetensors",
- "model.layers.43.input_layernorm.weight": "model-00020-of-00027.safetensors",
- "model.layers.43.post_attention_layernorm.weight": "model-00020-of-00027.safetensors",
+ "model.layers.43.input_layernorm.weight": "model-00019-of-00027.safetensors",
+ "model.layers.43.post_attention_layernorm.weight": "model-00019-of-00027.safetensors",
  "model.layers.43.self_attn.k_norm.weight": "model-00019-of-00027.safetensors",
  "model.layers.43.self_attn.k_proj.biases": "model-00019-of-00027.safetensors",
  "model.layers.43.self_attn.k_proj.scales": "model-00019-of-00027.safetensors",
@@ -1064,7 +1064,7 @@
  "model.layers.43.self_attn.v_proj.scales": "model-00019-of-00027.safetensors",
  "model.layers.43.self_attn.v_proj.weight": "model-00019-of-00027.safetensors",
  "model.layers.44.block_sparse_moe.e_score_correction_bias": "model-00020-of-00027.safetensors",
- "model.layers.44.block_sparse_moe.gate.weight": "model-00020-of-00027.safetensors",
+ "model.layers.44.block_sparse_moe.gate.weight": "model-00019-of-00027.safetensors",
  "model.layers.44.block_sparse_moe.switch_mlp.down_proj.biases": "model-00020-of-00027.safetensors",
  "model.layers.44.block_sparse_moe.switch_mlp.down_proj.scales": "model-00020-of-00027.safetensors",
  "model.layers.44.block_sparse_moe.switch_mlp.down_proj.weight": "model-00020-of-00027.safetensors",
@@ -1076,20 +1076,20 @@
  "model.layers.44.block_sparse_moe.switch_mlp.up_proj.weight": "model-00020-of-00027.safetensors",
  "model.layers.44.input_layernorm.weight": "model-00020-of-00027.safetensors",
  "model.layers.44.post_attention_layernorm.weight": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.k_norm.weight": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.k_proj.biases": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.k_proj.scales": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.k_proj.weight": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.o_proj.biases": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.o_proj.scales": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.o_proj.weight": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.q_norm.weight": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.q_proj.biases": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.q_proj.scales": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.q_proj.weight": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.v_proj.biases": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.v_proj.scales": "model-00020-of-00027.safetensors",
- "model.layers.44.self_attn.v_proj.weight": "model-00020-of-00027.safetensors",
+ "model.layers.44.self_attn.k_norm.weight": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.k_proj.biases": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.k_proj.scales": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.k_proj.weight": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.o_proj.biases": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.o_proj.scales": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.o_proj.weight": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.q_norm.weight": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.q_proj.biases": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.q_proj.scales": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.q_proj.weight": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.v_proj.biases": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.v_proj.scales": "model-00019-of-00027.safetensors",
+ "model.layers.44.self_attn.v_proj.weight": "model-00019-of-00027.safetensors",
  "model.layers.45.block_sparse_moe.e_score_correction_bias": "model-00020-of-00027.safetensors",
  "model.layers.45.block_sparse_moe.gate.weight": "model-00020-of-00027.safetensors",
  "model.layers.45.block_sparse_moe.switch_mlp.down_proj.biases": "model-00020-of-00027.safetensors",