/home/cfruan/.conda/envs/mlc-source-311/bin/python -m mlc_llm gen_config /models/Meta-Llama-3-8B-Instruct --quantization q0f32 --conv-template llama-3 --output /tmp/tmpq8el2iww --context-window-size 8192 --prefill-chunk-size 1024
[2024-04-18 15:59:56] INFO auto_config.py:115: Found model configuration: /models/Meta-Llama-3-8B-Instruct/config.json
[2024-04-18 15:59:56] INFO auto_config.py:153: Found model type: llama. Use `--model-type` to override.
[2024-04-18 15:59:56] INFO llama_model.py:52: context_window_size not found in config.json. Falling back to max_position_embeddings (8192)
[2024-04-18 15:59:56] INFO llama_model.py:72: prefill_chunk_size defaults to context_window_size (8192)
[2024-04-18 15:59:56] INFO config.py:106: Overriding context_window_size from 8192 to 8192
[2024-04-18 15:59:56] INFO config.py:106: Overriding prefill_chunk_size from 8192 to 1024
[2024-04-18 15:59:56] INFO config.py:106: Overriding max_batch_size from 1 to 80
[2024-04-18 15:59:56] INFO gen_config.py:187: [generation_config.json] Setting bos_token_id: 128000
[2024-04-18 15:59:56] INFO gen_config.py:187: [generation_config.json] Setting eos_token_id: 128001
[2024-04-18 15:59:56] INFO gen_config.py:201: Not found tokenizer config: /models/Meta-Llama-3-8B-Instruct/tokenizer.model
[2024-04-18 15:59:56] INFO gen_config.py:199: Found tokenizer config: /models/Meta-Llama-3-8B-Instruct/tokenizer.json. Copying to /tmp/tmpq8el2iww/tokenizer.json
[2024-04-18 15:59:56] INFO gen_config.py:201: Not found tokenizer config: /models/Meta-Llama-3-8B-Instruct/vocab.json
[2024-04-18 15:59:56] INFO gen_config.py:201: Not found tokenizer config: /models/Meta-Llama-3-8B-Instruct/merges.txt
[2024-04-18 15:59:56] INFO gen_config.py:201: Not found tokenizer config: /models/Meta-Llama-3-8B-Instruct/added_tokens.json
[2024-04-18 15:59:56] INFO gen_config.py:199: Found tokenizer config: /models/Meta-Llama-3-8B-Instruct/tokenizer_config.json. Copying to /tmp/tmpq8el2iww/tokenizer_config.json
[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting pad_token_id: 0
[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting temperature: 0.7
[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting presence_penalty: 0.0
[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting frequency_penalty: 0.0
[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting repetition_penalty: 1.0
[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting top_p: 0.95
[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting mean_gen_len: 128
[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting max_gen_len: 512
[2024-04-18 15:59:56] INFO gen_config.py:76: [System default] Setting shift_fill_factor: 0.3
[2024-04-18 15:59:56] INFO gen_config.py:263: Dumping configuration file to: /tmp/tmpq8el2iww/mlc-chat-config.json
/home/cfruan/.conda/envs/mlc-source-311/bin/python -m mlc_llm convert_weight /models/Meta-Llama-3-8B-Instruct --quantization q0f32 --source-format auto --output /tmp/tmpq8el2iww
[2024-04-18 15:59:57] INFO auto_config.py:115: Found model configuration: /models/Meta-Llama-3-8B-Instruct/config.json
[2024-04-18 15:59:58] INFO auto_device.py:76: Found device: cuda:0
[2024-04-18 15:59:58] INFO auto_device.py:76: Found device: cuda:1
[2024-04-18 15:59:59] INFO auto_device.py:85: Not found device: rocm:0
[2024-04-18 16:00:00] INFO auto_device.py:85: Not found device: metal:0
[2024-04-18 16:00:01] INFO auto_device.py:76: Found device: vulkan:0
[2024-04-18 16:00:01] INFO auto_device.py:76: Found device: vulkan:1
[2024-04-18 16:00:01] INFO auto_device.py:76: Found device: vulkan:2
[2024-04-18 16:00:02] INFO auto_device.py:85: Not found device: opencl:0
[2024-04-18 16:00:02] INFO auto_device.py:33: Using device: cuda:0
[2024-04-18 16:00:02] INFO auto_weight.py:70: Finding weights in: /models/Meta-Llama-3-8B-Instruct
[2024-04-18 16:00:02] INFO auto_weight.py:136: Not found Huggingface PyTorch
[2024-04-18 16:00:02] INFO auto_weight.py:143: Found source weight format: huggingface-safetensor. Source configuration: /models/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
[2024-04-18 16:00:02] INFO auto_weight.py:106: Using source weight configuration: /models/Meta-Llama-3-8B-Instruct/model.safetensors.index.json. Use `--source` to override.
[2024-04-18 16:00:02] INFO auto_weight.py:110: Using source weight format: huggingface-safetensor. Use `--source-format` to override.
[2024-04-18 16:00:02] INFO auto_config.py:153: Found model type: llama. Use `--model-type` to override.
[2024-04-18 16:00:02] INFO llama_model.py:52: context_window_size not found in config.json. Falling back to max_position_embeddings (8192)
[2024-04-18 16:00:02] INFO llama_model.py:72: prefill_chunk_size defaults to context_window_size (8192)
Weight conversion with arguments:
  --config          /models/Meta-Llama-3-8B-Instruct/config.json
  --quantization    NoQuantize(name='q0f32', kind='no-quant', model_dtype='float32')
  --model-type      llama
  --device          cuda:0
  --source          /models/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
  --source-format   huggingface-safetensor
  --output          /tmp/tmpq8el2iww
Start storing to cache /tmp/tmpq8el2iww
[2024-04-18 16:00:06] INFO huggingface_loader.py:184: Loading HF parameters from: /models/Meta-Llama-3-8B-Instruct/model-00004-of-00004.safetensors
[2024-04-18 16:00:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "lm_head.weight", shape: (128256, 4096), dtype: float32
/home/cfruan/.conda/envs/mlc-source-311/lib/python3.11/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/cfruan/.conda/envs/mlc-source-311/lib/python3.11/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
[2024-04-18 16:00:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.31.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:00:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.31.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:00:33] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.31.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:00:33] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.norm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:00:33] INFO huggingface_loader.py:196: Unloading HF weight file: /models/Meta-Llama-3-8B-Instruct/model-00004-of-00004.safetensors
[2024-04-18 16:00:33] INFO huggingface_loader.py:184: Loading HF parameters from: /models/Meta-Llama-3-8B-Instruct/model-00001-of-00004.safetensors
[2024-04-18 16:00:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.embed_tokens.weight", shape: (128256, 4096), dtype: float32
[2024-04-18 16:01:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.0.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.0.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.0.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:08] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.0.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.0.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.0.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.1.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.1.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:10] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.1.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.1.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.1.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.1.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.2.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.2.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:13] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.2.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:14] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.2.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:14] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.2.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:14] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.2.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:15] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.3.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:15] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.3.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:15] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.3.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.3.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.3.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.3.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.4.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.4.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:18] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.4.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:19] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.4.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:20] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.4.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:20] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.4.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:20] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.5.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:20] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.5.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.5.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.5.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.5.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:23] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.5.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:23] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.6.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:23] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.6.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:24] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.6.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.6.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.6.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.6.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.7.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:25] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.7.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.7.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.7.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.7.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:28] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.7.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:28] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.8.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:28] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.8.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:29] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.8.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:30] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.8.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:30] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.8.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:30] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.8.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:31] INFO huggingface_loader.py:196: Unloading HF weight file: /models/Meta-Llama-3-8B-Instruct/model-00001-of-00004.safetensors
[2024-04-18 16:01:31] INFO huggingface_loader.py:184: Loading HF parameters from: /models/Meta-Llama-3-8B-Instruct/model-00002-of-00004.safetensors
[2024-04-18 16:01:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.10.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.10.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:43] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.10.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:44] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.10.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:44] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.10.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.10.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.11.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.11.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:46] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.11.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:47] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.11.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.11.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.11.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.12.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.12.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:49] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.12.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.12.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.12.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.12.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.13.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:52] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.13.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:53] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.13.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:01:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.13.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.13.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:01:57] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.13.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:01:57] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.14.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:01:57] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.14.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:01:58] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.14.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:02:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.14.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:02:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.14.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:02:02] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.14.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:02:02] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.15.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:02:02] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.15.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:02:03] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.15.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:02:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.15.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:02:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.15.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:02:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.15.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:02:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.16.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:02:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.16.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:02:08] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.16.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:02:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.16.post_attention_layernorm.weight", shape: (4096,),
dtype: float32 51%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 99/195 [02:04<01:53, 1.18s/it] [2024-04-18 16:02:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.16.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 51%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 99/195 [02:04<01:53, 1.18s/it] 52%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 101/195 [02:04<01:15, 1.24it/s] [2024-04-18 16:02:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.16.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 52%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 101/195 [02:04<01:15, 1.24it/s] 52%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 102/195 [02:05<01:03, 1.47it/s] [2024-04-18 16:02:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.input_layernorm.weight[0m", shape: (4096,), dtype: float32 52%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 102/195 [02:05<01:03, 1.47it/s] [2024-04-18 16:02:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 52%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 102/195 [02:05<01:03, 1.47it/s] 53%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 104/195 [02:06<00:59, 1.54it/s] [2024-04-18 16:02:13] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 53%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 104/195 [02:06<00:59, 1.54it/s] 54%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 105/195 
[02:09<01:48, 1.20s/it] [2024-04-18 16:02:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 54%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 105/195 [02:09<01:48, 1.20s/it] [2024-04-18 16:02:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 54%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 105/195 [02:09<01:48, 1.20s/it] 55%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 107/195 [02:09<01:11, 1.23it/s] [2024-04-18 16:02:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.17.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 55%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 107/195 [02:09<01:11, 1.23it/s] 55%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 108/195 [02:10<00:59, 1.46it/s] [2024-04-18 16:02:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.input_layernorm.weight[0m", shape: (4096,), dtype: float32 55%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 108/195 [02:10<00:59, 1.46it/s] [2024-04-18 16:02:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 55%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 108/195 [02:10<00:59, 1.46it/s] 56%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 110/195 [02:11<00:53, 1.58it/s] [2024-04-18 16:02:18] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 
56%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 110/195 [02:11<00:53, 1.58it/s] 57%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 111/195 [02:14<01:37, 1.17s/it] [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 57%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 111/195 [02:14<01:37, 1.17s/it] [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 57%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 111/195 [02:14<01:37, 1.17s/it] 58%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 113/195 [02:14<01:04, 1.27it/s] [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.18.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 58%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 113/195 [02:14<01:04, 1.27it/s] 58%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 114/195 [02:14<00:53, 1.50it/s] [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.input_layernorm.weight[0m", shape: (4096,), dtype: float32 58%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 114/195 [02:14<00:53, 1.50it/s] [2024-04-18 16:02:21] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 58%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 114/195 [02:15<00:53, 1.50it/s] 
59%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 116/195 [02:16<00:48, 1.61it/s] [2024-04-18 16:02:23] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 59%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 116/195 [02:16<00:48, 1.61it/s] 60%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 117/195 [02:19<01:33, 1.19s/it] [2024-04-18 16:02:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 60%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 117/195 [02:19<01:33, 1.19s/it] [2024-04-18 16:02:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 60%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 117/195 [02:19<01:33, 1.19s/it] 61%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 119/195 [02:19<01:01, 1.24it/s] [2024-04-18 16:02:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.19.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 61%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 119/195 [02:19<01:01, 1.24it/s] 62%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 120/195 [02:19<00:51, 1.46it/s] [2024-04-18 16:02:26] INFO huggingface_loader.py:184: Loading HF parameters from: /models/Meta-Llama-3-8B-Instruct/model-00003-of-00004.safetensors 62%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 120/195 [02:19<00:51, 1.46it/s] 
[2024-04-18 16:02:38] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 62%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 120/195 [02:31<00:51, 1.46it/s] 62%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 121/195 [02:34<04:56, 4.01s/it] [2024-04-18 16:02:41] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 62%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 121/195 [02:35<04:56, 4.01s/it] 63%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 122/195 [02:35<03:59, 3.28s/it] [2024-04-18 16:02:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 63%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 122/195 [02:36<03:59, 3.28s/it] 63%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 123/195 [02:36<03:02, 2.53s/it] [2024-04-18 16:02:43] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.input_layernorm.weight[0m", shape: (4096,), dtype: float32 63%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 123/195 [02:36<03:02, 2.53s/it] [2024-04-18 16:02:43] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 63%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 123/195 [02:36<03:02, 2.53s/it] 64%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 125/195 
[02:38<02:05, 1.79s/it] [2024-04-18 16:02:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 64%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 125/195 [02:38<02:05, 1.79s/it] 65%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 126/195 [02:41<02:33, 2.22s/it] [2024-04-18 16:02:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 65%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 126/195 [02:41<02:33, 2.22s/it] [2024-04-18 16:02:48] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 65%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 126/195 [02:41<02:33, 2.22s/it] 66%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 128/195 [02:42<01:38, 1.47s/it] [2024-04-18 16:02:49] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.9.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 66%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 128/195 [02:42<01:38, 1.47s/it] 66%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 129/195 [02:42<01:18, 1.19s/it] [2024-04-18 16:02:49] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.input_layernorm.weight[0m", shape: (4096,), dtype: float32 66%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 129/195 [02:42<01:18, 1.19s/it] [2024-04-18 16:02:49] INFO 
huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 66%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 129/195 [02:42<01:18, 1.19s/it] 67%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 131/195 [02:44<01:06, 1.04s/it] [2024-04-18 16:02:51] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.20.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 67%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 131/195 [02:44<01:06, 1.04s/it] [2024-04-18 16:02:51] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.input_layernorm.weight[0m", shape: (4096,), dtype: float32 67%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 131/195 [02:44<01:06, 1.04s/it] [2024-04-18 16:02:51] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 67%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 131/195 [02:44<01:06, 1.04s/it] 69%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 134/195 [02:45<00:47, 1.28it/s] [2024-04-18 16:02:53] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 69%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 134/195 [02:46<00:47, 1.28it/s] 69%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 135/195 [02:49<01:15, 1.26s/it] [2024-04-18 16:02:55] INFO huggingface_loader.py:174: [Not quantized] 
Parameter: "[1mmodel.layers.21.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 69%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 135/195 [02:49<01:15, 1.26s/it] [2024-04-18 16:02:55] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 69%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 135/195 [02:49<01:15, 1.26s/it] 70%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 137/195 [02:49<00:52, 1.11it/s] [2024-04-18 16:02:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.21.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 70%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 137/195 [02:49<00:52, 1.11it/s] 71%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 138/195 [02:49<00:43, 1.30it/s] [2024-04-18 16:02:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.input_layernorm.weight[0m", shape: (4096,), dtype: float32 71%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 138/195 [02:49<00:43, 1.30it/s] [2024-04-18 16:02:56] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 71%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 138/195 [02:49<00:43, 1.30it/s] 72%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 140/195 [02:50<00:38, 1.42it/s] [2024-04-18 16:02:58] INFO huggingface_loader.py:174: [Not quantized] Parameter: 
"[1mmodel.layers.22.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 72%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 140/195 [02:51<00:38, 1.42it/s] 72%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 141/195 [02:54<01:09, 1.29s/it] [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 72%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 141/195 [02:54<01:09, 1.29s/it] [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 72%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 141/195 [02:54<01:09, 1.29s/it] 73%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 143/195 [02:54<00:46, 1.12it/s] [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.22.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 73%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 143/195 [02:54<00:46, 1.12it/s] 74%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 144/195 [02:55<00:38, 1.33it/s] [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.input_layernorm.weight[0m", shape: (4096,), dtype: float32 74%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 144/195 [02:55<00:38, 1.33it/s] [2024-04-18 16:03:01] INFO huggingface_loader.py:174: [Not 
quantized] Parameter: "[1mmodel.layers.23.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 74%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 144/195 [02:55<00:38, 1.33it/s] 75%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 146/195 [02:56<00:34, 1.41it/s] [2024-04-18 16:03:03] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 75%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 146/195 [02:57<00:34, 1.41it/s] 75%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 147/195 [02:59<01:03, 1.32s/it] [2024-04-18 16:03:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 75%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 147/195 [02:59<01:03, 1.32s/it] [2024-04-18 16:03:06] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 75%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 147/195 [03:00<01:03, 1.32s/it] 76%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 149/195 [03:00<00:41, 1.12it/s] [2024-04-18 16:03:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.23.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 76%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 149/195 [03:00<00:41, 1.12it/s] 
77%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 150/195 [03:00<00:33, 1.33it/s] [2024-04-18 16:03:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.input_layernorm.weight[0m", shape: (4096,), dtype: float32 77%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 150/195 [03:00<00:33, 1.33it/s] [2024-04-18 16:03:07] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 77%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 150/195 [03:00<00:33, 1.33it/s] 78%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 152/195 [03:01<00:30, 1.40it/s] [2024-04-18 16:03:09] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 78%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 152/195 [03:02<00:30, 1.40it/s] 78%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 153/195 [03:05<00:52, 1.25s/it] [2024-04-18 16:03:11] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 78%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 153/195 [03:05<00:52, 1.25s/it] [2024-04-18 16:03:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 
78%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 153/195 [03:05<00:52, 1.25s/it] 79%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 155/195 [03:05<00:33, 1.18it/s] [2024-04-18 16:03:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.24.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 79%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 155/195 [03:05<00:33, 1.18it/s] 80%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 156/195 [03:05<00:27, 1.40it/s] [2024-04-18 16:03:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.input_layernorm.weight[0m", shape: (4096,), dtype: float32 80%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 156/195 [03:05<00:27, 1.40it/s] [2024-04-18 16:03:12] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 80%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 156/195 [03:05<00:27, 1.40it/s] 81%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 158/195 [03:06<00:24, 1.53it/s] [2024-04-18 16:03:14] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.mlp.gate_up_proj.weight[0m", shape: (28672, 4096), dtype: float32 81%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 158/195 [03:07<00:24, 1.53it/s] 
82%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 159/195 [03:10<00:44, 1.22s/it] [2024-04-18 16:03:16] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.post_attention_layernorm.weight[0m", shape: (4096,), dtype: float32 82%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 159/195 [03:10<00:44, 1.22s/it] [2024-04-18 16:03:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.self_attn.qkv_proj.weight[0m", shape: (6144, 4096), dtype: float32 82%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 159/195 [03:10<00:44, 1.22s/it] 83%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 161/195 [03:10<00:28, 1.21it/s] [2024-04-18 16:03:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.25.self_attn.o_proj.weight[0m", shape: (4096, 4096), dtype: float32 83%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 161/195 [03:10<00:28, 1.21it/s] 83%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 162/195 [03:10<00:22, 1.44it/s] [2024-04-18 16:03:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.26.input_layernorm.weight[0m", shape: (4096,), dtype: float32 83%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 162/195 [03:10<00:22, 1.44it/s] [2024-04-18 16:03:17] INFO huggingface_loader.py:174: [Not quantized] Parameter: "[1mmodel.layers.26.mlp.down_proj.weight[0m", shape: (4096, 14336), dtype: float32 
[2024-04-18 16:03:19] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.26.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.26.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.26.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.26.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.27.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:03:22] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.27.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:03:24] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.27.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:03:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.27.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:03:26] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.27.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:03:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.27.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:03:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:03:27] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:03:29] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.28.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:03:32] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:03:34] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:03:36] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:03:36] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:03:37] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.29.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:03:37] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.input_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:03:37] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.mlp.down_proj.weight", shape: (4096, 14336), dtype: float32
[2024-04-18 16:03:39] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
[2024-04-18 16:03:41] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.post_attention_layernorm.weight", shape: (4096,), dtype: float32
[2024-04-18 16:03:41] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:03:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.30.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
[2024-04-18 16:03:42] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.31.mlp.gate_up_proj.weight", shape: (28672, 4096), dtype: float32
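The tensor shapes in the loader log follow directly from the Llama-3-8B geometry (hidden size 4096, 32 query heads, 8 KV heads, head dim 128, FFN intermediate size 14336): the fused `qkv_proj` and `gate_up_proj` rows are just the concatenated rows of the individual projections. A minimal sketch of that arithmetic, assuming those standard Llama-3-8B config values:

```python
# Reconstruct the fused-projection shapes seen in the log from
# Llama-3-8B's configuration (hidden 4096, 32 Q heads, 8 KV heads
# for grouped-query attention, head_dim 128, intermediate 14336).
hidden = 4096
n_heads, n_kv_heads, head_dim = 32, 8, 128
intermediate = 14336

q_rows = n_heads * head_dim        # 32 * 128 = 4096
kv_rows = n_kv_heads * head_dim    # 8 * 128 = 1024 each for K and V
qkv_rows = q_rows + 2 * kv_rows    # fused qkv_proj rows -> 6144
gate_up_rows = 2 * intermediate    # fused gate_proj + up_proj -> 28672

print((qkv_rows, hidden))      # matches the logged qkv_proj shape (6144, 4096)
print((gate_up_rows, hidden))  # matches the logged gate_up_proj shape (28672, 4096)
```

The smaller KV row count (1024 vs. 4096) is what grouped-query attention looks like in the weights: 8 KV heads are shared across 32 query heads.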
[2024-04-18 16:03:45] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.31.self_attn.qkv_proj.weight", shape: (6144, 4096), dtype: float32
[2024-04-18 16:03:46] INFO huggingface_loader.py:174: [Not quantized] Parameter: "model.layers.31.self_attn.o_proj.weight", shape: (4096, 4096), dtype: float32
100%|████████████████████| 195/195 [03:39<00:00, 1.13s/it]
[2024-04-18 16:03:46] INFO huggingface_loader.py:196: Unloading HF weight file: /models/Meta-Llama-3-8B-Instruct/model-00002-of-00004.safetensors
[2024-04-18 16:03:46] INFO huggingface_loader.py:196: Unloading HF weight file: /models/Meta-Llama-3-8B-Instruct/model-00003-of-00004.safetensors
[2024-04-18 16:03:47] INFO stats.py:76: Time usage: HF loading: 36.734 sec; Pre-quantization mapping: 24.043 sec; Quantization: 0.000 sec
[2024-04-18 16:03:47] INFO stats.py:90: RAM usage: Peak RAM: 18.469 GB. Total bytes loaded from disk: 29.915 GB
[2024-04-18 16:03:47] INFO convert_weight.py:156: Parameter size after quantization: 29.915 GB
[2024-04-18 16:03:47] INFO convert_weight.py:161: Total parameters: 8,030,261,248
[2024-04-18 16:03:47] INFO convert_weight.py:162: Bits per parameter: 32.000
[2024-04-18 16:03:47] INFO convert_weight.py:167: Saved to directory: /tmp/tmpq8el2iww
All finished, 131 total shards committed, record saved to /tmp/tmpq8el2iww/ndarray-cache.json
Also saved a bf16 record to /tmp/tmpq8el2iww/ndarray-cache-b16.json
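The summary figures are self-consistent: with `q0f32` nothing is quantized, so every parameter stays a 4-byte float32 and the size follows directly from the parameter count. A quick sanity check, assuming (as the numbers imply) that the log's "GB" is a binary gigabyte (2^30 bytes):

```python
# Sanity-check the conversion summary: 8,030,261,248 float32 parameters
# at 4 bytes each should reproduce the reported 29.915 GB and 32 bits/param.
total_params = 8_030_261_248        # "Total parameters" from the log
bytes_per_param = 4                 # q0f32: unquantized float32

size_bytes = total_params * bytes_per_param
size_gib = size_bytes / 2**30       # log's "GB" behaves as GiB here
bits_per_param = size_bytes * 8 / total_params

print(f"{size_gib:.3f} GB, {bits_per_param:.3f} bits/param")
# -> 29.915 GB, 32.000 bits/param
```

This also explains the peak RAM of 18.469 GB being well below the total 29.915 GB: the loader streams one safetensors shard at a time and unloads it (the `Unloading HF weight file` lines) rather than holding the whole model in memory.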