hongyongjiang committed on
Commit 2799d27 · verified · 1 Parent(s): ac9542c

Upload folder using huggingface_hub
qwen3-vl-4b/acts.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2a931f2b9373fc5d63f2d8386a808f56245ddd5e772428177a6c6a16fcdc2523
+ size 55975
qwen3-vl-4b/mmproj-fp16.npz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:56824b179176c4112ab1fb4a1005f7156b78aced2d0b6aa99b0b102e1eb28537
+ size 642657486
qwen3-vl-4b/mmproj.txt ADDED
@@ -0,0 +1,537 @@
1
+ Loading GGUF file: mmproj-F16.gguf
2
+ Found 316 tensors
3
+ Converting v.blk.0.attn_out.bias to FP16
4
+ v.blk.0.attn_out.bias -> model.layers.0.self_attn.o_proj.bias shape: (1024,)
5
+ v.blk.0.attn_out.weight -> model.layers.0.self_attn.o_proj.weight shape: (1024, 1024)
6
+ Converting v.blk.0.attn_qkv.bias to FP16
7
+ v.blk.0.attn_qkv.bias -> model.layers.0.self_attn.qkv_proj.bias shape: (3072,)
8
+ v.blk.0.attn_qkv.weight -> model.layers.0.self_attn.qkv_proj.weight shape: (3072, 1024)
9
+ Converting v.blk.0.ffn_up.bias to FP16
10
+ v.blk.0.ffn_up.bias -> model.layers.0.mlp.up_proj.bias shape: (4096,)
11
+ v.blk.0.ffn_up.weight -> model.layers.0.mlp.up_proj.weight shape: (4096, 1024)
12
+ Converting v.blk.0.ffn_down.bias to FP16
13
+ v.blk.0.ffn_down.bias -> model.layers.0.mlp.down_proj.bias shape: (1024,)
14
+ v.blk.0.ffn_down.weight -> model.layers.0.mlp.down_proj.weight shape: (1024, 4096)
15
+ Converting v.blk.0.ln1.bias to FP16
16
+ v.blk.0.ln1.bias -> model.layers.0.ln1.bias shape: (1024,)
17
+ Converting v.blk.0.ln1.weight to FP16
18
+ v.blk.0.ln1.weight -> model.layers.0.ln1.weight shape: (1024,)
19
+ Converting v.blk.0.ln2.bias to FP16
20
+ v.blk.0.ln2.bias -> model.layers.0.ln2.bias shape: (1024,)
21
+ Converting v.blk.0.ln2.weight to FP16
22
+ v.blk.0.ln2.weight -> model.layers.0.ln2.weight shape: (1024,)
23
+ Converting v.blk.1.attn_out.bias to FP16
24
+ v.blk.1.attn_out.bias -> model.layers.1.self_attn.o_proj.bias shape: (1024,)
25
+ v.blk.1.attn_out.weight -> model.layers.1.self_attn.o_proj.weight shape: (1024, 1024)
26
+ Converting v.blk.1.attn_qkv.bias to FP16
27
+ v.blk.1.attn_qkv.bias -> model.layers.1.self_attn.qkv_proj.bias shape: (3072,)
28
+ v.blk.1.attn_qkv.weight -> model.layers.1.self_attn.qkv_proj.weight shape: (3072, 1024)
29
+ Converting v.blk.1.ffn_up.bias to FP16
30
+ v.blk.1.ffn_up.bias -> model.layers.1.mlp.up_proj.bias shape: (4096,)
31
+ v.blk.1.ffn_up.weight -> model.layers.1.mlp.up_proj.weight shape: (4096, 1024)
32
+ Converting v.blk.1.ffn_down.bias to FP16
33
+ v.blk.1.ffn_down.bias -> model.layers.1.mlp.down_proj.bias shape: (1024,)
34
+ v.blk.1.ffn_down.weight -> model.layers.1.mlp.down_proj.weight shape: (1024, 4096)
35
+ Converting v.blk.1.ln1.bias to FP16
36
+ v.blk.1.ln1.bias -> model.layers.1.ln1.bias shape: (1024,)
37
+ Converting v.blk.1.ln1.weight to FP16
38
+ v.blk.1.ln1.weight -> model.layers.1.ln1.weight shape: (1024,)
39
+ Converting v.blk.1.ln2.bias to FP16
40
+ v.blk.1.ln2.bias -> model.layers.1.ln2.bias shape: (1024,)
41
+ Converting v.blk.1.ln2.weight to FP16
42
+ v.blk.1.ln2.weight -> model.layers.1.ln2.weight shape: (1024,)
43
+ Converting v.blk.10.attn_out.bias to FP16
44
+ v.blk.10.attn_out.bias -> model.layers.10.self_attn.o_proj.bias shape: (1024,)
45
+ v.blk.10.attn_out.weight -> model.layers.10.self_attn.o_proj.weight shape: (1024, 1024)
46
+ Converting v.blk.10.attn_qkv.bias to FP16
47
+ v.blk.10.attn_qkv.bias -> model.layers.10.self_attn.qkv_proj.bias shape: (3072,)
48
+ v.blk.10.attn_qkv.weight -> model.layers.10.self_attn.qkv_proj.weight shape: (3072, 1024)
49
+ Converting v.blk.10.ffn_up.bias to FP16
50
+ v.blk.10.ffn_up.bias -> model.layers.10.mlp.up_proj.bias shape: (4096,)
51
+ v.blk.10.ffn_up.weight -> model.layers.10.mlp.up_proj.weight shape: (4096, 1024)
52
+ Converting v.blk.10.ffn_down.bias to FP16
53
+ v.blk.10.ffn_down.bias -> model.layers.10.mlp.down_proj.bias shape: (1024,)
54
+ v.blk.10.ffn_down.weight -> model.layers.10.mlp.down_proj.weight shape: (1024, 4096)
55
+ Converting v.blk.10.ln1.bias to FP16
56
+ v.blk.10.ln1.bias -> model.layers.10.ln1.bias shape: (1024,)
57
+ Converting v.blk.10.ln1.weight to FP16
58
+ v.blk.10.ln1.weight -> model.layers.10.ln1.weight shape: (1024,)
59
+ Converting v.blk.10.ln2.bias to FP16
60
+ v.blk.10.ln2.bias -> model.layers.10.ln2.bias shape: (1024,)
61
+ Converting v.blk.10.ln2.weight to FP16
62
+ v.blk.10.ln2.weight -> model.layers.10.ln2.weight shape: (1024,)
63
+ Converting v.blk.11.attn_out.bias to FP16
64
+ v.blk.11.attn_out.bias -> model.layers.11.self_attn.o_proj.bias shape: (1024,)
65
+ v.blk.11.attn_out.weight -> model.layers.11.self_attn.o_proj.weight shape: (1024, 1024)
66
+ Converting v.blk.11.attn_qkv.bias to FP16
67
+ v.blk.11.attn_qkv.bias -> model.layers.11.self_attn.qkv_proj.bias shape: (3072,)
68
+ v.blk.11.attn_qkv.weight -> model.layers.11.self_attn.qkv_proj.weight shape: (3072, 1024)
69
+ Converting v.blk.11.ffn_up.bias to FP16
70
+ v.blk.11.ffn_up.bias -> model.layers.11.mlp.up_proj.bias shape: (4096,)
71
+ v.blk.11.ffn_up.weight -> model.layers.11.mlp.up_proj.weight shape: (4096, 1024)
72
+ Converting v.blk.11.ffn_down.bias to FP16
73
+ v.blk.11.ffn_down.bias -> model.layers.11.mlp.down_proj.bias shape: (1024,)
74
+ v.blk.11.ffn_down.weight -> model.layers.11.mlp.down_proj.weight shape: (1024, 4096)
75
+ Converting v.blk.11.ln1.bias to FP16
76
+ v.blk.11.ln1.bias -> model.layers.11.ln1.bias shape: (1024,)
77
+ Converting v.blk.11.ln1.weight to FP16
78
+ v.blk.11.ln1.weight -> model.layers.11.ln1.weight shape: (1024,)
79
+ Converting v.blk.11.ln2.bias to FP16
80
+ v.blk.11.ln2.bias -> model.layers.11.ln2.bias shape: (1024,)
81
+ Converting v.blk.11.ln2.weight to FP16
82
+ v.blk.11.ln2.weight -> model.layers.11.ln2.weight shape: (1024,)
83
+ Converting v.blk.12.attn_out.bias to FP16
84
+ v.blk.12.attn_out.bias -> model.layers.12.self_attn.o_proj.bias shape: (1024,)
85
+ v.blk.12.attn_out.weight -> model.layers.12.self_attn.o_proj.weight shape: (1024, 1024)
86
+ Converting v.blk.12.attn_qkv.bias to FP16
87
+ v.blk.12.attn_qkv.bias -> model.layers.12.self_attn.qkv_proj.bias shape: (3072,)
88
+ v.blk.12.attn_qkv.weight -> model.layers.12.self_attn.qkv_proj.weight shape: (3072, 1024)
89
+ Converting v.blk.12.ffn_up.bias to FP16
90
+ v.blk.12.ffn_up.bias -> model.layers.12.mlp.up_proj.bias shape: (4096,)
91
+ v.blk.12.ffn_up.weight -> model.layers.12.mlp.up_proj.weight shape: (4096, 1024)
92
+ Converting v.blk.12.ffn_down.bias to FP16
93
+ v.blk.12.ffn_down.bias -> model.layers.12.mlp.down_proj.bias shape: (1024,)
94
+ v.blk.12.ffn_down.weight -> model.layers.12.mlp.down_proj.weight shape: (1024, 4096)
95
+ Converting v.blk.12.ln1.bias to FP16
96
+ v.blk.12.ln1.bias -> model.layers.12.ln1.bias shape: (1024,)
97
+ Converting v.blk.12.ln1.weight to FP16
98
+ v.blk.12.ln1.weight -> model.layers.12.ln1.weight shape: (1024,)
99
+ Converting v.blk.12.ln2.bias to FP16
100
+ v.blk.12.ln2.bias -> model.layers.12.ln2.bias shape: (1024,)
101
+ Converting v.blk.12.ln2.weight to FP16
102
+ v.blk.12.ln2.weight -> model.layers.12.ln2.weight shape: (1024,)
103
+ Converting v.blk.13.attn_out.bias to FP16
104
+ v.blk.13.attn_out.bias -> model.layers.13.self_attn.o_proj.bias shape: (1024,)
105
+ v.blk.13.attn_out.weight -> model.layers.13.self_attn.o_proj.weight shape: (1024, 1024)
106
+ Converting v.blk.13.attn_qkv.bias to FP16
107
+ v.blk.13.attn_qkv.bias -> model.layers.13.self_attn.qkv_proj.bias shape: (3072,)
108
+ v.blk.13.attn_qkv.weight -> model.layers.13.self_attn.qkv_proj.weight shape: (3072, 1024)
109
+ Converting v.blk.13.ffn_up.bias to FP16
110
+ v.blk.13.ffn_up.bias -> model.layers.13.mlp.up_proj.bias shape: (4096,)
111
+ v.blk.13.ffn_up.weight -> model.layers.13.mlp.up_proj.weight shape: (4096, 1024)
112
+ Converting v.blk.13.ffn_down.bias to FP16
113
+ v.blk.13.ffn_down.bias -> model.layers.13.mlp.down_proj.bias shape: (1024,)
114
+ v.blk.13.ffn_down.weight -> model.layers.13.mlp.down_proj.weight shape: (1024, 4096)
115
+ Converting v.blk.13.ln1.bias to FP16
116
+ v.blk.13.ln1.bias -> model.layers.13.ln1.bias shape: (1024,)
117
+ Converting v.blk.13.ln1.weight to FP16
118
+ v.blk.13.ln1.weight -> model.layers.13.ln1.weight shape: (1024,)
119
+ Converting v.blk.13.ln2.bias to FP16
120
+ v.blk.13.ln2.bias -> model.layers.13.ln2.bias shape: (1024,)
121
+ Converting v.blk.13.ln2.weight to FP16
122
+ v.blk.13.ln2.weight -> model.layers.13.ln2.weight shape: (1024,)
123
+ Converting v.blk.14.attn_out.bias to FP16
124
+ v.blk.14.attn_out.bias -> model.layers.14.self_attn.o_proj.bias shape: (1024,)
125
+ v.blk.14.attn_out.weight -> model.layers.14.self_attn.o_proj.weight shape: (1024, 1024)
126
+ Converting v.blk.14.attn_qkv.bias to FP16
127
+ v.blk.14.attn_qkv.bias -> model.layers.14.self_attn.qkv_proj.bias shape: (3072,)
128
+ v.blk.14.attn_qkv.weight -> model.layers.14.self_attn.qkv_proj.weight shape: (3072, 1024)
129
+ Converting v.blk.14.ffn_up.bias to FP16
130
+ v.blk.14.ffn_up.bias -> model.layers.14.mlp.up_proj.bias shape: (4096,)
131
+ v.blk.14.ffn_up.weight -> model.layers.14.mlp.up_proj.weight shape: (4096, 1024)
132
+ Converting v.blk.14.ffn_down.bias to FP16
133
+ v.blk.14.ffn_down.bias -> model.layers.14.mlp.down_proj.bias shape: (1024,)
134
+ v.blk.14.ffn_down.weight -> model.layers.14.mlp.down_proj.weight shape: (1024, 4096)
135
+ Converting v.blk.14.ln1.bias to FP16
136
+ v.blk.14.ln1.bias -> model.layers.14.ln1.bias shape: (1024,)
137
+ Converting v.blk.14.ln1.weight to FP16
138
+ v.blk.14.ln1.weight -> model.layers.14.ln1.weight shape: (1024,)
139
+ Converting v.blk.14.ln2.bias to FP16
140
+ v.blk.14.ln2.bias -> model.layers.14.ln2.bias shape: (1024,)
141
+ Converting v.blk.14.ln2.weight to FP16
142
+ v.blk.14.ln2.weight -> model.layers.14.ln2.weight shape: (1024,)
143
+ Converting v.blk.15.attn_out.bias to FP16
144
+ v.blk.15.attn_out.bias -> model.layers.15.self_attn.o_proj.bias shape: (1024,)
145
+ v.blk.15.attn_out.weight -> model.layers.15.self_attn.o_proj.weight shape: (1024, 1024)
146
+ Converting v.blk.15.attn_qkv.bias to FP16
147
+ v.blk.15.attn_qkv.bias -> model.layers.15.self_attn.qkv_proj.bias shape: (3072,)
148
+ v.blk.15.attn_qkv.weight -> model.layers.15.self_attn.qkv_proj.weight shape: (3072, 1024)
149
+ Converting v.blk.15.ffn_up.bias to FP16
150
+ v.blk.15.ffn_up.bias -> model.layers.15.mlp.up_proj.bias shape: (4096,)
151
+ v.blk.15.ffn_up.weight -> model.layers.15.mlp.up_proj.weight shape: (4096, 1024)
152
+ Converting v.blk.15.ffn_down.bias to FP16
153
+ v.blk.15.ffn_down.bias -> model.layers.15.mlp.down_proj.bias shape: (1024,)
154
+ v.blk.15.ffn_down.weight -> model.layers.15.mlp.down_proj.weight shape: (1024, 4096)
155
+ Converting v.blk.15.ln1.bias to FP16
156
+ v.blk.15.ln1.bias -> model.layers.15.ln1.bias shape: (1024,)
157
+ Converting v.blk.15.ln1.weight to FP16
158
+ v.blk.15.ln1.weight -> model.layers.15.ln1.weight shape: (1024,)
159
+ Converting v.blk.15.ln2.bias to FP16
160
+ v.blk.15.ln2.bias -> model.layers.15.ln2.bias shape: (1024,)
161
+ Converting v.blk.15.ln2.weight to FP16
162
+ v.blk.15.ln2.weight -> model.layers.15.ln2.weight shape: (1024,)
163
+ Converting v.blk.16.attn_out.bias to FP16
164
+ v.blk.16.attn_out.bias -> model.layers.16.self_attn.o_proj.bias shape: (1024,)
165
+ v.blk.16.attn_out.weight -> model.layers.16.self_attn.o_proj.weight shape: (1024, 1024)
166
+ Converting v.blk.16.attn_qkv.bias to FP16
167
+ v.blk.16.attn_qkv.bias -> model.layers.16.self_attn.qkv_proj.bias shape: (3072,)
168
+ v.blk.16.attn_qkv.weight -> model.layers.16.self_attn.qkv_proj.weight shape: (3072, 1024)
169
+ Converting v.blk.16.ffn_up.bias to FP16
170
+ v.blk.16.ffn_up.bias -> model.layers.16.mlp.up_proj.bias shape: (4096,)
171
+ v.blk.16.ffn_up.weight -> model.layers.16.mlp.up_proj.weight shape: (4096, 1024)
172
+ Converting v.blk.16.ffn_down.bias to FP16
173
+ v.blk.16.ffn_down.bias -> model.layers.16.mlp.down_proj.bias shape: (1024,)
174
+ v.blk.16.ffn_down.weight -> model.layers.16.mlp.down_proj.weight shape: (1024, 4096)
175
+ Converting v.blk.16.ln1.bias to FP16
176
+ v.blk.16.ln1.bias -> model.layers.16.ln1.bias shape: (1024,)
177
+ Converting v.blk.16.ln1.weight to FP16
178
+ v.blk.16.ln1.weight -> model.layers.16.ln1.weight shape: (1024,)
179
+ Converting v.blk.16.ln2.bias to FP16
180
+ v.blk.16.ln2.bias -> model.layers.16.ln2.bias shape: (1024,)
181
+ Converting v.blk.16.ln2.weight to FP16
182
+ v.blk.16.ln2.weight -> model.layers.16.ln2.weight shape: (1024,)
183
+ Converting v.blk.17.attn_out.bias to FP16
184
+ v.blk.17.attn_out.bias -> model.layers.17.self_attn.o_proj.bias shape: (1024,)
185
+ v.blk.17.attn_out.weight -> model.layers.17.self_attn.o_proj.weight shape: (1024, 1024)
186
+ Converting v.blk.17.attn_qkv.bias to FP16
187
+ v.blk.17.attn_qkv.bias -> model.layers.17.self_attn.qkv_proj.bias shape: (3072,)
188
+ v.blk.17.attn_qkv.weight -> model.layers.17.self_attn.qkv_proj.weight shape: (3072, 1024)
189
+ Converting v.blk.17.ffn_up.bias to FP16
190
+ v.blk.17.ffn_up.bias -> model.layers.17.mlp.up_proj.bias shape: (4096,)
191
+ v.blk.17.ffn_up.weight -> model.layers.17.mlp.up_proj.weight shape: (4096, 1024)
192
+ Converting v.blk.17.ffn_down.bias to FP16
193
+ v.blk.17.ffn_down.bias -> model.layers.17.mlp.down_proj.bias shape: (1024,)
194
+ v.blk.17.ffn_down.weight -> model.layers.17.mlp.down_proj.weight shape: (1024, 4096)
195
+ Converting v.blk.17.ln1.bias to FP16
196
+ v.blk.17.ln1.bias -> model.layers.17.ln1.bias shape: (1024,)
197
+ Converting v.blk.17.ln1.weight to FP16
198
+ v.blk.17.ln1.weight -> model.layers.17.ln1.weight shape: (1024,)
199
+ Converting v.blk.17.ln2.bias to FP16
200
+ v.blk.17.ln2.bias -> model.layers.17.ln2.bias shape: (1024,)
201
+ Converting v.blk.17.ln2.weight to FP16
202
+ v.blk.17.ln2.weight -> model.layers.17.ln2.weight shape: (1024,)
203
+ Converting v.blk.18.attn_out.bias to FP16
204
+ v.blk.18.attn_out.bias -> model.layers.18.self_attn.o_proj.bias shape: (1024,)
205
+ v.blk.18.attn_out.weight -> model.layers.18.self_attn.o_proj.weight shape: (1024, 1024)
206
+ Converting v.blk.18.attn_qkv.bias to FP16
207
+ v.blk.18.attn_qkv.bias -> model.layers.18.self_attn.qkv_proj.bias shape: (3072,)
208
+ v.blk.18.attn_qkv.weight -> model.layers.18.self_attn.qkv_proj.weight shape: (3072, 1024)
209
+ Converting v.blk.18.ffn_up.bias to FP16
210
+ v.blk.18.ffn_up.bias -> model.layers.18.mlp.up_proj.bias shape: (4096,)
211
+ v.blk.18.ffn_up.weight -> model.layers.18.mlp.up_proj.weight shape: (4096, 1024)
212
+ Converting v.blk.18.ffn_down.bias to FP16
213
+ v.blk.18.ffn_down.bias -> model.layers.18.mlp.down_proj.bias shape: (1024,)
214
+ v.blk.18.ffn_down.weight -> model.layers.18.mlp.down_proj.weight shape: (1024, 4096)
215
+ Converting v.blk.18.ln1.bias to FP16
216
+ v.blk.18.ln1.bias -> model.layers.18.ln1.bias shape: (1024,)
217
+ Converting v.blk.18.ln1.weight to FP16
218
+ v.blk.18.ln1.weight -> model.layers.18.ln1.weight shape: (1024,)
219
+ Converting v.blk.18.ln2.bias to FP16
220
+ v.blk.18.ln2.bias -> model.layers.18.ln2.bias shape: (1024,)
221
+ Converting v.blk.18.ln2.weight to FP16
222
+ v.blk.18.ln2.weight -> model.layers.18.ln2.weight shape: (1024,)
223
+ Converting v.blk.19.attn_out.bias to FP16
224
+ v.blk.19.attn_out.bias -> model.layers.19.self_attn.o_proj.bias shape: (1024,)
225
+ v.blk.19.attn_out.weight -> model.layers.19.self_attn.o_proj.weight shape: (1024, 1024)
226
+ Converting v.blk.19.attn_qkv.bias to FP16
227
+ v.blk.19.attn_qkv.bias -> model.layers.19.self_attn.qkv_proj.bias shape: (3072,)
228
+ v.blk.19.attn_qkv.weight -> model.layers.19.self_attn.qkv_proj.weight shape: (3072, 1024)
229
+ Converting v.blk.19.ffn_up.bias to FP16
230
+ v.blk.19.ffn_up.bias -> model.layers.19.mlp.up_proj.bias shape: (4096,)
231
+ v.blk.19.ffn_up.weight -> model.layers.19.mlp.up_proj.weight shape: (4096, 1024)
232
+ Converting v.blk.19.ffn_down.bias to FP16
233
+ v.blk.19.ffn_down.bias -> model.layers.19.mlp.down_proj.bias shape: (1024,)
234
+ v.blk.19.ffn_down.weight -> model.layers.19.mlp.down_proj.weight shape: (1024, 4096)
235
+ Converting v.blk.19.ln1.bias to FP16
236
+ v.blk.19.ln1.bias -> model.layers.19.ln1.bias shape: (1024,)
237
+ Converting v.blk.19.ln1.weight to FP16
238
+ v.blk.19.ln1.weight -> model.layers.19.ln1.weight shape: (1024,)
239
+ Converting v.blk.19.ln2.bias to FP16
240
+ v.blk.19.ln2.bias -> model.layers.19.ln2.bias shape: (1024,)
241
+ Converting v.blk.19.ln2.weight to FP16
242
+ v.blk.19.ln2.weight -> model.layers.19.ln2.weight shape: (1024,)
243
+ Converting v.blk.2.attn_out.bias to FP16
244
+ v.blk.2.attn_out.bias -> model.layers.2.self_attn.o_proj.bias shape: (1024,)
245
+ v.blk.2.attn_out.weight -> model.layers.2.self_attn.o_proj.weight shape: (1024, 1024)
246
+ Converting v.blk.2.attn_qkv.bias to FP16
247
+ v.blk.2.attn_qkv.bias -> model.layers.2.self_attn.qkv_proj.bias shape: (3072,)
248
+ v.blk.2.attn_qkv.weight -> model.layers.2.self_attn.qkv_proj.weight shape: (3072, 1024)
249
+ Converting v.blk.2.ffn_up.bias to FP16
250
+ v.blk.2.ffn_up.bias -> model.layers.2.mlp.up_proj.bias shape: (4096,)
251
+ v.blk.2.ffn_up.weight -> model.layers.2.mlp.up_proj.weight shape: (4096, 1024)
252
+ Converting v.blk.2.ffn_down.bias to FP16
253
+ v.blk.2.ffn_down.bias -> model.layers.2.mlp.down_proj.bias shape: (1024,)
254
+ v.blk.2.ffn_down.weight -> model.layers.2.mlp.down_proj.weight shape: (1024, 4096)
255
+ Converting v.blk.2.ln1.bias to FP16
256
+ v.blk.2.ln1.bias -> model.layers.2.ln1.bias shape: (1024,)
257
+ Converting v.blk.2.ln1.weight to FP16
258
+ v.blk.2.ln1.weight -> model.layers.2.ln1.weight shape: (1024,)
259
+ Converting v.blk.2.ln2.bias to FP16
260
+ v.blk.2.ln2.bias -> model.layers.2.ln2.bias shape: (1024,)
261
+ Converting v.blk.2.ln2.weight to FP16
262
+ v.blk.2.ln2.weight -> model.layers.2.ln2.weight shape: (1024,)
263
+ Converting v.blk.20.attn_out.bias to FP16
264
+ v.blk.20.attn_out.bias -> model.layers.20.self_attn.o_proj.bias shape: (1024,)
265
+ v.blk.20.attn_out.weight -> model.layers.20.self_attn.o_proj.weight shape: (1024, 1024)
266
+ Converting v.blk.20.attn_qkv.bias to FP16
267
+ v.blk.20.attn_qkv.bias -> model.layers.20.self_attn.qkv_proj.bias shape: (3072,)
268
+ v.blk.20.attn_qkv.weight -> model.layers.20.self_attn.qkv_proj.weight shape: (3072, 1024)
269
+ Converting v.blk.20.ffn_up.bias to FP16
270
+ v.blk.20.ffn_up.bias -> model.layers.20.mlp.up_proj.bias shape: (4096,)
271
+ v.blk.20.ffn_up.weight -> model.layers.20.mlp.up_proj.weight shape: (4096, 1024)
272
+ Converting v.blk.20.ffn_down.bias to FP16
273
+ v.blk.20.ffn_down.bias -> model.layers.20.mlp.down_proj.bias shape: (1024,)
274
+ v.blk.20.ffn_down.weight -> model.layers.20.mlp.down_proj.weight shape: (1024, 4096)
275
+ Converting v.blk.20.ln1.bias to FP16
276
+ v.blk.20.ln1.bias -> model.layers.20.ln1.bias shape: (1024,)
277
+ Converting v.blk.20.ln1.weight to FP16
278
+ v.blk.20.ln1.weight -> model.layers.20.ln1.weight shape: (1024,)
279
+ Converting v.blk.20.ln2.bias to FP16
280
+ v.blk.20.ln2.bias -> model.layers.20.ln2.bias shape: (1024,)
281
+ Converting v.blk.20.ln2.weight to FP16
282
+ v.blk.20.ln2.weight -> model.layers.20.ln2.weight shape: (1024,)
283
+ Converting v.blk.21.attn_out.bias to FP16
284
+ v.blk.21.attn_out.bias -> model.layers.21.self_attn.o_proj.bias shape: (1024,)
285
+ v.blk.21.attn_out.weight -> model.layers.21.self_attn.o_proj.weight shape: (1024, 1024)
286
+ Converting v.blk.21.attn_qkv.bias to FP16
287
+ v.blk.21.attn_qkv.bias -> model.layers.21.self_attn.qkv_proj.bias shape: (3072,)
288
+ v.blk.21.attn_qkv.weight -> model.layers.21.self_attn.qkv_proj.weight shape: (3072, 1024)
289
+ Converting v.blk.21.ffn_up.bias to FP16
290
+ v.blk.21.ffn_up.bias -> model.layers.21.mlp.up_proj.bias shape: (4096,)
291
+ v.blk.21.ffn_up.weight -> model.layers.21.mlp.up_proj.weight shape: (4096, 1024)
292
+ Converting v.blk.21.ffn_down.bias to FP16
293
+ v.blk.21.ffn_down.bias -> model.layers.21.mlp.down_proj.bias shape: (1024,)
294
+ v.blk.21.ffn_down.weight -> model.layers.21.mlp.down_proj.weight shape: (1024, 4096)
295
+ Converting v.blk.21.ln1.bias to FP16
296
+ v.blk.21.ln1.bias -> model.layers.21.ln1.bias shape: (1024,)
297
+ Converting v.blk.21.ln1.weight to FP16
298
+ v.blk.21.ln1.weight -> model.layers.21.ln1.weight shape: (1024,)
299
+ Converting v.blk.21.ln2.bias to FP16
300
+ v.blk.21.ln2.bias -> model.layers.21.ln2.bias shape: (1024,)
301
+ Converting v.blk.21.ln2.weight to FP16
302
+ v.blk.21.ln2.weight -> model.layers.21.ln2.weight shape: (1024,)
303
+ Converting v.blk.22.attn_out.bias to FP16
304
+ v.blk.22.attn_out.bias -> model.layers.22.self_attn.o_proj.bias shape: (1024,)
305
+ v.blk.22.attn_out.weight -> model.layers.22.self_attn.o_proj.weight shape: (1024, 1024)
306
+ Converting v.blk.22.attn_qkv.bias to FP16
307
+ v.blk.22.attn_qkv.bias -> model.layers.22.self_attn.qkv_proj.bias shape: (3072,)
308
+ v.blk.22.attn_qkv.weight -> model.layers.22.self_attn.qkv_proj.weight shape: (3072, 1024)
309
+ Converting v.blk.22.ffn_up.bias to FP16
310
+ v.blk.22.ffn_up.bias -> model.layers.22.mlp.up_proj.bias shape: (4096,)
311
+ v.blk.22.ffn_up.weight -> model.layers.22.mlp.up_proj.weight shape: (4096, 1024)
312
+ Converting v.blk.22.ffn_down.bias to FP16
313
+ v.blk.22.ffn_down.bias -> model.layers.22.mlp.down_proj.bias shape: (1024,)
314
+ v.blk.22.ffn_down.weight -> model.layers.22.mlp.down_proj.weight shape: (1024, 4096)
315
+ Converting v.blk.22.ln1.bias to FP16
316
+ v.blk.22.ln1.bias -> model.layers.22.ln1.bias shape: (1024,)
317
+ Converting v.blk.22.ln1.weight to FP16
318
+ v.blk.22.ln1.weight -> model.layers.22.ln1.weight shape: (1024,)
319
+ Converting v.blk.22.ln2.bias to FP16
320
+ v.blk.22.ln2.bias -> model.layers.22.ln2.bias shape: (1024,)
321
+ Converting v.blk.22.ln2.weight to FP16
322
+ v.blk.22.ln2.weight -> model.layers.22.ln2.weight shape: (1024,)
323
+ Converting v.blk.23.attn_out.bias to FP16
324
+ v.blk.23.attn_out.bias -> model.layers.23.self_attn.o_proj.bias shape: (1024,)
325
+ v.blk.23.attn_out.weight -> model.layers.23.self_attn.o_proj.weight shape: (1024, 1024)
326
+ Converting v.blk.23.attn_qkv.bias to FP16
327
+ v.blk.23.attn_qkv.bias -> model.layers.23.self_attn.qkv_proj.bias shape: (3072,)
328
+ v.blk.23.attn_qkv.weight -> model.layers.23.self_attn.qkv_proj.weight shape: (3072, 1024)
329
+ Converting v.blk.23.ffn_up.bias to FP16
330
+ v.blk.23.ffn_up.bias -> model.layers.23.mlp.up_proj.bias shape: (4096,)
331
+ v.blk.23.ffn_up.weight -> model.layers.23.mlp.up_proj.weight shape: (4096, 1024)
332
+ Converting v.blk.23.ffn_down.bias to FP16
333
+ v.blk.23.ffn_down.bias -> model.layers.23.mlp.down_proj.bias shape: (1024,)
334
+ v.blk.23.ffn_down.weight -> model.layers.23.mlp.down_proj.weight shape: (1024, 4096)
335
+ Converting v.blk.23.ln1.bias to FP16
336
+ v.blk.23.ln1.bias -> model.layers.23.ln1.bias shape: (1024,)
337
+ Converting v.blk.23.ln1.weight to FP16
338
+ v.blk.23.ln1.weight -> model.layers.23.ln1.weight shape: (1024,)
339
+ Converting v.blk.23.ln2.bias to FP16
340
+ v.blk.23.ln2.bias -> model.layers.23.ln2.bias shape: (1024,)
341
+ Converting v.blk.23.ln2.weight to FP16
342
+ v.blk.23.ln2.weight -> model.layers.23.ln2.weight shape: (1024,)
343
+ Converting v.blk.3.attn_out.bias to FP16
344
+ v.blk.3.attn_out.bias -> model.layers.3.self_attn.o_proj.bias shape: (1024,)
345
+ v.blk.3.attn_out.weight -> model.layers.3.self_attn.o_proj.weight shape: (1024, 1024)
346
+ Converting v.blk.3.attn_qkv.bias to FP16
347
+ v.blk.3.attn_qkv.bias -> model.layers.3.self_attn.qkv_proj.bias shape: (3072,)
348
+ v.blk.3.attn_qkv.weight -> model.layers.3.self_attn.qkv_proj.weight shape: (3072, 1024)
349
+ Converting v.blk.3.ffn_up.bias to FP16
350
+ v.blk.3.ffn_up.bias -> model.layers.3.mlp.up_proj.bias shape: (4096,)
351
+ v.blk.3.ffn_up.weight -> model.layers.3.mlp.up_proj.weight shape: (4096, 1024)
352
+ Converting v.blk.3.ffn_down.bias to FP16
353
+ v.blk.3.ffn_down.bias -> model.layers.3.mlp.down_proj.bias shape: (1024,)
354
+ v.blk.3.ffn_down.weight -> model.layers.3.mlp.down_proj.weight shape: (1024, 4096)
355
+ Converting v.blk.3.ln1.bias to FP16
356
+ v.blk.3.ln1.bias -> model.layers.3.ln1.bias shape: (1024,)
357
+ Converting v.blk.3.ln1.weight to FP16
358
+ v.blk.3.ln1.weight -> model.layers.3.ln1.weight shape: (1024,)
359
+ Converting v.blk.3.ln2.bias to FP16
360
+ v.blk.3.ln2.bias -> model.layers.3.ln2.bias shape: (1024,)
361
+ Converting v.blk.3.ln2.weight to FP16
362
+ v.blk.3.ln2.weight -> model.layers.3.ln2.weight shape: (1024,)
363
+ Converting v.blk.4.attn_out.bias to FP16
364
+ v.blk.4.attn_out.bias -> model.layers.4.self_attn.o_proj.bias shape: (1024,)
365
+ v.blk.4.attn_out.weight -> model.layers.4.self_attn.o_proj.weight shape: (1024, 1024)
366
+ Converting v.blk.4.attn_qkv.bias to FP16
367
+ v.blk.4.attn_qkv.bias -> model.layers.4.self_attn.qkv_proj.bias shape: (3072,)
368
+ v.blk.4.attn_qkv.weight -> model.layers.4.self_attn.qkv_proj.weight shape: (3072, 1024)
369
+ Converting v.blk.4.ffn_up.bias to FP16
370
+ v.blk.4.ffn_up.bias -> model.layers.4.mlp.up_proj.bias shape: (4096,)
371
+ v.blk.4.ffn_up.weight -> model.layers.4.mlp.up_proj.weight shape: (4096, 1024)
372
+ Converting v.blk.4.ffn_down.bias to FP16
373
+ v.blk.4.ffn_down.bias -> model.layers.4.mlp.down_proj.bias shape: (1024,)
374
+ v.blk.4.ffn_down.weight -> model.layers.4.mlp.down_proj.weight shape: (1024, 4096)
375
+ Converting v.blk.4.ln1.bias to FP16
376
+ v.blk.4.ln1.bias -> model.layers.4.ln1.bias shape: (1024,)
377
+ Converting v.blk.4.ln1.weight to FP16
378
+ v.blk.4.ln1.weight -> model.layers.4.ln1.weight shape: (1024,)
379
+ Converting v.blk.4.ln2.bias to FP16
380
+ v.blk.4.ln2.bias -> model.layers.4.ln2.bias shape: (1024,)
381
+ Converting v.blk.4.ln2.weight to FP16
382
+ v.blk.4.ln2.weight -> model.layers.4.ln2.weight shape: (1024,)
383
+ Converting v.blk.5.attn_out.bias to FP16
384
+ v.blk.5.attn_out.bias -> model.layers.5.self_attn.o_proj.bias shape: (1024,)
385
+ v.blk.5.attn_out.weight -> model.layers.5.self_attn.o_proj.weight shape: (1024, 1024)
386
+ Converting v.blk.5.attn_qkv.bias to FP16
387
+ v.blk.5.attn_qkv.bias -> model.layers.5.self_attn.qkv_proj.bias shape: (3072,)
388
+ v.blk.5.attn_qkv.weight -> model.layers.5.self_attn.qkv_proj.weight shape: (3072, 1024)
389
+ Converting v.blk.5.ffn_up.bias to FP16
390
+ v.blk.5.ffn_up.bias -> model.layers.5.mlp.up_proj.bias shape: (4096,)
391
+ v.blk.5.ffn_up.weight -> model.layers.5.mlp.up_proj.weight shape: (4096, 1024)
392
+ Converting v.blk.5.ffn_down.bias to FP16
393
+ v.blk.5.ffn_down.bias -> model.layers.5.mlp.down_proj.bias shape: (1024,)
394
+ v.blk.5.ffn_down.weight -> model.layers.5.mlp.down_proj.weight shape: (1024, 4096)
395
+ Converting v.blk.5.ln1.bias to FP16
396
+ v.blk.5.ln1.bias -> model.layers.5.ln1.bias shape: (1024,)
397
+ Converting v.blk.5.ln1.weight to FP16
398
+ v.blk.5.ln1.weight -> model.layers.5.ln1.weight shape: (1024,)
399
+ Converting v.blk.5.ln2.bias to FP16
400
+ v.blk.5.ln2.bias -> model.layers.5.ln2.bias shape: (1024,)
401
+ Converting v.blk.5.ln2.weight to FP16
402
+ v.blk.5.ln2.weight -> model.layers.5.ln2.weight shape: (1024,)
403
+ Converting v.blk.6.attn_out.bias to FP16
404
+ v.blk.6.attn_out.bias -> model.layers.6.self_attn.o_proj.bias shape: (1024,)
405
+ v.blk.6.attn_out.weight -> model.layers.6.self_attn.o_proj.weight shape: (1024, 1024)
406
+ Converting v.blk.6.attn_qkv.bias to FP16
407
+ v.blk.6.attn_qkv.bias -> model.layers.6.self_attn.qkv_proj.bias shape: (3072,)
408
+ v.blk.6.attn_qkv.weight -> model.layers.6.self_attn.qkv_proj.weight shape: (3072, 1024)
409
+ Converting v.blk.6.ffn_up.bias to FP16
410
+ v.blk.6.ffn_up.bias -> model.layers.6.mlp.up_proj.bias shape: (4096,)
411
+ v.blk.6.ffn_up.weight -> model.layers.6.mlp.up_proj.weight shape: (4096, 1024)
412
+ Converting v.blk.6.ffn_down.bias to FP16
413
+ v.blk.6.ffn_down.bias -> model.layers.6.mlp.down_proj.bias shape: (1024,)
414
+ v.blk.6.ffn_down.weight -> model.layers.6.mlp.down_proj.weight shape: (1024, 4096)
415
+ Converting v.blk.6.ln1.bias to FP16
416
+ v.blk.6.ln1.bias -> model.layers.6.ln1.bias shape: (1024,)
417
+ Converting v.blk.6.ln1.weight to FP16
418
+ v.blk.6.ln1.weight -> model.layers.6.ln1.weight shape: (1024,)
419
+ Converting v.blk.6.ln2.bias to FP16
420
+ v.blk.6.ln2.bias -> model.layers.6.ln2.bias shape: (1024,)
421
+ Converting v.blk.6.ln2.weight to FP16
422
+ v.blk.6.ln2.weight -> model.layers.6.ln2.weight shape: (1024,)
423
+ Converting v.blk.7.attn_out.bias to FP16
424
+ v.blk.7.attn_out.bias -> model.layers.7.self_attn.o_proj.bias shape: (1024,)
425
+ v.blk.7.attn_out.weight -> model.layers.7.self_attn.o_proj.weight shape: (1024, 1024)
426
+ Converting v.blk.7.attn_qkv.bias to FP16
427
+ v.blk.7.attn_qkv.bias -> model.layers.7.self_attn.qkv_proj.bias shape: (3072,)
428
+ v.blk.7.attn_qkv.weight -> model.layers.7.self_attn.qkv_proj.weight shape: (3072, 1024)
+ Converting v.blk.7.ffn_up.bias to FP16
+ v.blk.7.ffn_up.bias -> model.layers.7.mlp.up_proj.bias shape: (4096,)
+ v.blk.7.ffn_up.weight -> model.layers.7.mlp.up_proj.weight shape: (4096, 1024)
+ Converting v.blk.7.ffn_down.bias to FP16
+ v.blk.7.ffn_down.bias -> model.layers.7.mlp.down_proj.bias shape: (1024,)
+ v.blk.7.ffn_down.weight -> model.layers.7.mlp.down_proj.weight shape: (1024, 4096)
+ Converting v.blk.7.ln1.bias to FP16
+ v.blk.7.ln1.bias -> model.layers.7.ln1.bias shape: (1024,)
+ Converting v.blk.7.ln1.weight to FP16
+ v.blk.7.ln1.weight -> model.layers.7.ln1.weight shape: (1024,)
+ Converting v.blk.7.ln2.bias to FP16
+ v.blk.7.ln2.bias -> model.layers.7.ln2.bias shape: (1024,)
+ Converting v.blk.7.ln2.weight to FP16
+ v.blk.7.ln2.weight -> model.layers.7.ln2.weight shape: (1024,)
+ Converting v.blk.8.attn_out.bias to FP16
+ v.blk.8.attn_out.bias -> model.layers.8.self_attn.o_proj.bias shape: (1024,)
+ v.blk.8.attn_out.weight -> model.layers.8.self_attn.o_proj.weight shape: (1024, 1024)
+ Converting v.blk.8.attn_qkv.bias to FP16
+ v.blk.8.attn_qkv.bias -> model.layers.8.self_attn.qkv_proj.bias shape: (3072,)
+ v.blk.8.attn_qkv.weight -> model.layers.8.self_attn.qkv_proj.weight shape: (3072, 1024)
+ Converting v.blk.8.ffn_up.bias to FP16
+ v.blk.8.ffn_up.bias -> model.layers.8.mlp.up_proj.bias shape: (4096,)
+ v.blk.8.ffn_up.weight -> model.layers.8.mlp.up_proj.weight shape: (4096, 1024)
+ Converting v.blk.8.ffn_down.bias to FP16
+ v.blk.8.ffn_down.bias -> model.layers.8.mlp.down_proj.bias shape: (1024,)
+ v.blk.8.ffn_down.weight -> model.layers.8.mlp.down_proj.weight shape: (1024, 4096)
+ Converting v.blk.8.ln1.bias to FP16
+ v.blk.8.ln1.bias -> model.layers.8.ln1.bias shape: (1024,)
+ Converting v.blk.8.ln1.weight to FP16
+ v.blk.8.ln1.weight -> model.layers.8.ln1.weight shape: (1024,)
+ Converting v.blk.8.ln2.bias to FP16
+ v.blk.8.ln2.bias -> model.layers.8.ln2.bias shape: (1024,)
+ Converting v.blk.8.ln2.weight to FP16
+ v.blk.8.ln2.weight -> model.layers.8.ln2.weight shape: (1024,)
+ Converting v.blk.9.attn_out.bias to FP16
+ v.blk.9.attn_out.bias -> model.layers.9.self_attn.o_proj.bias shape: (1024,)
+ v.blk.9.attn_out.weight -> model.layers.9.self_attn.o_proj.weight shape: (1024, 1024)
+ Converting v.blk.9.attn_qkv.bias to FP16
+ v.blk.9.attn_qkv.bias -> model.layers.9.self_attn.qkv_proj.bias shape: (3072,)
+ v.blk.9.attn_qkv.weight -> model.layers.9.self_attn.qkv_proj.weight shape: (3072, 1024)
+ Converting v.blk.9.ffn_up.bias to FP16
+ v.blk.9.ffn_up.bias -> model.layers.9.mlp.up_proj.bias shape: (4096,)
+ v.blk.9.ffn_up.weight -> model.layers.9.mlp.up_proj.weight shape: (4096, 1024)
+ Converting v.blk.9.ffn_down.bias to FP16
+ v.blk.9.ffn_down.bias -> model.layers.9.mlp.down_proj.bias shape: (1024,)
+ v.blk.9.ffn_down.weight -> model.layers.9.mlp.down_proj.weight shape: (1024, 4096)
+ Converting v.blk.9.ln1.bias to FP16
+ v.blk.9.ln1.bias -> model.layers.9.ln1.bias shape: (1024,)
+ Converting v.blk.9.ln1.weight to FP16
+ v.blk.9.ln1.weight -> model.layers.9.ln1.weight shape: (1024,)
+ Converting v.blk.9.ln2.bias to FP16
+ v.blk.9.ln2.bias -> model.layers.9.ln2.bias shape: (1024,)
+ Converting v.blk.9.ln2.weight to FP16
+ v.blk.9.ln2.weight -> model.layers.9.ln2.weight shape: (1024,)
+ Converting v.deepstack.5.fc1.bias to FP16
+ v.deepstack.5.fc1.bias -> model.deepstack.5.fc1.bias shape: (4096,)
+ v.deepstack.5.fc1.weight -> model.deepstack.5.fc1.weight shape: (4096, 4096)
+ Converting v.deepstack.5.fc2.bias to FP16
+ v.deepstack.5.fc2.bias -> model.deepstack.5.fc2.bias shape: (2560,)
+ v.deepstack.5.fc2.weight -> model.deepstack.5.fc2.weight shape: (2560, 4096)
+ Converting v.deepstack.5.norm.bias to FP16
+ v.deepstack.5.norm.bias -> model.deepstack.5.norm.bias shape: (4096,)
+ Converting v.deepstack.5.norm.weight to FP16
+ v.deepstack.5.norm.weight -> model.deepstack.5.norm.weight shape: (4096,)
+ Converting v.deepstack.11.fc1.bias to FP16
+ v.deepstack.11.fc1.bias -> model.deepstack.11.fc1.bias shape: (4096,)
+ v.deepstack.11.fc1.weight -> model.deepstack.11.fc1.weight shape: (4096, 4096)
+ Converting v.deepstack.11.fc2.bias to FP16
+ v.deepstack.11.fc2.bias -> model.deepstack.11.fc2.bias shape: (2560,)
+ v.deepstack.11.fc2.weight -> model.deepstack.11.fc2.weight shape: (2560, 4096)
+ Converting v.deepstack.11.norm.bias to FP16
+ v.deepstack.11.norm.bias -> model.deepstack.11.norm.bias shape: (4096,)
+ Converting v.deepstack.11.norm.weight to FP16
+ v.deepstack.11.norm.weight -> model.deepstack.11.norm.weight shape: (4096,)
+ Converting v.deepstack.17.fc1.bias to FP16
+ v.deepstack.17.fc1.bias -> model.deepstack.17.fc1.bias shape: (4096,)
+ v.deepstack.17.fc1.weight -> model.deepstack.17.fc1.weight shape: (4096, 4096)
+ Converting v.deepstack.17.fc2.bias to FP16
+ v.deepstack.17.fc2.bias -> model.deepstack.17.fc2.bias shape: (2560,)
+ v.deepstack.17.fc2.weight -> model.deepstack.17.fc2.weight shape: (2560, 4096)
+ Converting v.deepstack.17.norm.bias to FP16
+ v.deepstack.17.norm.bias -> model.deepstack.17.norm.bias shape: (4096,)
+ Converting v.deepstack.17.norm.weight to FP16
+ v.deepstack.17.norm.weight -> model.deepstack.17.norm.weight shape: (4096,)
+ Converting mm.0.bias to FP16
+ mm.0.bias -> model.mm.0.bias shape: (4096,)
+ mm.0.weight -> model.mm.0.weight shape: (4096, 4096)
+ Converting mm.2.bias to FP16
+ mm.2.bias -> model.mm.2.bias shape: (2560,)
+ mm.2.weight -> model.mm.2.weight shape: (2560, 4096)
+ Converting v.post_ln.bias to FP16
+ v.post_ln.bias -> model.post_ln.bias shape: (1024,)
+ Converting v.post_ln.weight to FP16
+ v.post_ln.weight -> model.post_ln.weight shape: (1024,)
+ Converting v.patch_embd.bias to FP16
+ v.patch_embd.bias -> model.patch_embd.bias shape: (1024,)
+ v.patch_embd.weight -> model.patch_embd.weight shape: (1024, 3, 16, 16)
+ v.patch_embd.weight.1 -> model.patch_embd.weight.1 shape: (1024, 3, 16, 16)
+ Converting v.position_embd.weight to FP16
+ v.position_embd.weight -> model.position_embd.weight shape: (2304, 1024)
+
+ Converted 316 tensors
+
+ All required tensors present!
+
+ Saving to qwen3-vl-4b/mmproj-fp16.npz...
+ Output file size: 0.60 GB
+
+ Conversion complete!
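The log above implies a systematic rename from GGUF vision-tower tensor names (`v.blk.N.*`, `v.deepstack.N.*`, `mm.*`) to the `model.*` layout in the npz. A minimal sketch of that mapping, inferred only from the log lines (the `remap_name` function and its regex are assumptions, not the actual converter's code):

```python
# Hypothetical reconstruction of the GGUF -> npz tensor-name remapping
# suggested by the conversion log; not the converter's real implementation.
import re

# Per-block sub-module renames observed in the log.
PART_MAP = {
    "attn_out": "self_attn.o_proj",
    "attn_qkv": "self_attn.qkv_proj",
    "ffn_up": "mlp.up_proj",
    "ffn_down": "mlp.down_proj",
    "ln1": "ln1",
    "ln2": "ln2",
}

def remap_name(gguf_name: str) -> str:
    """Map a GGUF tensor name to the model.* name shown in the log."""
    # v.blk.<N>.<part>.<weight|bias> -> model.layers.<N>.<mapped part>.<weight|bias>
    m = re.match(r"v\.blk\.(\d+)\.(\w+)\.(weight|bias)$", gguf_name)
    if m:
        layer, part, kind = m.groups()
        return f"model.layers.{layer}.{PART_MAP[part]}.{kind}"
    # v.deepstack.*, v.post_ln.*, v.patch_embd.*, v.position_embd.*:
    # the "v." prefix is simply replaced with "model."
    if gguf_name.startswith("v."):
        return "model." + gguf_name[len("v."):]
    # mm.* projector tensors just gain the "model." prefix.
    return "model." + gguf_name
```

For example, `remap_name("v.blk.8.ffn_down.weight")` reproduces the log's `model.layers.8.mlp.down_proj.weight`.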
qwen3-vl-4b/model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3109e254938ff3d66477b10880bda5bf744bfc9be66c08755765fa5d67fad30c
+ size 8045090783
qwen3-vl-4b/out-fp16.npz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:978c95d567adbd14e55f2703dcabac4b1148b7c722ea839ab53eb879df1be8e5
+ size 6797257649
qwen3-vl-4b/out.npz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fa2c56277a413cb044389d87723af357c16d2cf3e4e4cbd120e63d177d9ec3b2
+ size 2927132949
qwen3-vl-4b/scale.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dcaf2e8164bc08707382108563f164c79a67aa315a10918eec8de7b4189e2f96
+ size 4566483
qwen3-vl-4b/smooth.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5a4772a5bc2c0594bade56d31693cee385c522bc1773278ca9c18dbec22f14f6
+ size 2773543
qwen3-vl-4b/wgts.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cdde16b81c020527f418e855715261a92f4aca0a994c54cbdb9dcb14f1bcbf3c
+ size 4519327