06/08/2024 23:48:32 - INFO - transformers.tokenization_utils_base - loading file vocab.json
06/08/2024 23:48:32 - INFO - transformers.tokenization_utils_base - loading file merges.txt
06/08/2024 23:48:32 - INFO - transformers.tokenization_utils_base - loading file tokenizer.json
06/08/2024 23:48:32 - INFO - transformers.tokenization_utils_base - loading file added_tokens.json
06/08/2024 23:48:32 - INFO - transformers.tokenization_utils_base - loading file special_tokens_map.json
06/08/2024 23:48:32 - INFO - transformers.tokenization_utils_base - loading file tokenizer_config.json
06/08/2024 23:48:33 - WARNING - transformers.tokenization_utils_base - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/08/2024 23:48:33 - INFO - llamafactory.data.loader - Loading dataset longzu.json...
06/08/2024 23:48:34 - INFO - transformers.configuration_utils - loading configuration file C:\AI\Qwen2_0.5B\config.json
06/08/2024 23:48:34 - INFO - transformers.configuration_utils - Model config Qwen2Config {
  "_name_or_path": "C:\\AI\\Qwen2_0.5B",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 896,
  "initializer_range": 0.02,
  "intermediate_size": 4864,
  "max_position_embeddings": 131072,
  "max_window_layers": 24,
  "model_type": "qwen2",
  "num_attention_heads": 14,
  "num_hidden_layers": 24,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 131072,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.41.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}
06/08/2024 23:48:34 - INFO - transformers.modeling_utils - loading weights file C:\AI\Qwen2_0.5B\model.safetensors
06/08/2024 23:48:34 - INFO - transformers.modeling_utils - Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
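As a sanity check, the trainable-parameter count that llamafactory.model.loader reports later in this log (494,032,768) can be re-derived from the Qwen2Config values above. This is a minimal sketch assuming the standard Qwen2 layout: tied input/output embeddings, Q/K/V projections with bias, O and MLP projections without bias, and weight-only RMSNorms.

```python
# Re-derive the parameter count of Qwen2-0.5B from its config values.
# Layout assumptions (not stated in the log itself): tie_word_embeddings=true
# means the LM head shares the embedding matrix; q/k/v projections carry a
# bias; o_proj and the MLP projections do not; RMSNorm has a weight only.

hidden = 896          # hidden_size
layers = 24           # num_hidden_layers
heads = 14            # num_attention_heads
kv_heads = 2          # num_key_value_heads (grouped-query attention)
inter = 4864          # intermediate_size
vocab = 151936        # vocab_size

head_dim = hidden // heads        # 64
kv_dim = kv_heads * head_dim      # 128

embed = vocab * hidden            # counted once due to weight tying
attn = (hidden * hidden + hidden            # q_proj + bias
        + 2 * (hidden * kv_dim + kv_dim)    # k_proj, v_proj + biases
        + hidden * hidden)                  # o_proj, no bias
mlp = 3 * hidden * inter          # gate_proj, up_proj, down_proj
norms = 2 * hidden                # input + post-attention RMSNorm per layer

total = embed + layers * (attn + mlp + norms) + hidden  # + final RMSNorm
print(total)  # 494032768, matching the "trainable params" line in the log
```

The result matches the logged count exactly, which also confirms that the full fine-tune updates every weight in the model.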
06/08/2024 23:48:34 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151643
}
06/08/2024 23:48:36 - INFO - transformers.modeling_utils - All model checkpoint weights were used when initializing Qwen2ForCausalLM.
06/08/2024 23:48:36 - INFO - transformers.modeling_utils - All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at C:\AI\Qwen2_0.5B. If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
06/08/2024 23:48:36 - INFO - transformers.generation.configuration_utils - loading configuration file C:\AI\Qwen2_0.5B\generation_config.json
06/08/2024 23:48:36 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "max_new_tokens": 2048
}
06/08/2024 23:48:37 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/08/2024 23:48:37 - INFO - llamafactory.model.model_utils.attention - Using torch SDPA for faster training and inference.
06/08/2024 23:48:37 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/08/2024 23:48:37 - INFO - llamafactory.model.adapter - Fine-tuning method: Full
06/08/2024 23:48:37 - INFO - llamafactory.model.loader - trainable params: 494032768 || all params: 494032768 || trainable%: 100.0000
06/08/2024 23:48:37 - INFO - transformers.trainer - ***** Running training *****
06/08/2024 23:48:37 - INFO - transformers.trainer - Num examples = 345
06/08/2024 23:48:37 - INFO - transformers.trainer - Num Epochs = 35
06/08/2024 23:48:37 - INFO - transformers.trainer - Instantaneous batch size per device = 1
06/08/2024 23:48:37 - INFO - transformers.trainer - Total train batch size (w. parallel, distributed & accumulation) = 8
06/08/2024 23:48:37 - INFO - transformers.trainer - Gradient Accumulation steps = 8
06/08/2024 23:48:37 - INFO - transformers.trainer - Total optimization steps = 1,505
06/08/2024 23:48:37 - INFO - transformers.trainer - Number of trainable parameters = 494,032,768
06/08/2024 23:50:19 - INFO - llamafactory.extras.callbacks - {'loss': 3.8116, 'learning_rate': 4.9999e-05, 'epoch': 0.12, 'throughput': 1052.64}
06/08/2024 23:52:10 - INFO - llamafactory.extras.callbacks - {'loss': 3.6928, 'learning_rate': 4.9995e-05, 'epoch': 0.23, 'throughput': 1060.20}
06/08/2024 23:54:31 - INFO - llamafactory.extras.callbacks - {'loss': 3.6227, 'learning_rate': 4.9988e-05, 'epoch': 0.35, 'throughput': 1024.22}
06/08/2024 23:56:20 - INFO - llamafactory.extras.callbacks - {'loss': 3.6010, 'learning_rate': 4.9978e-05, 'epoch': 0.46, 'throughput': 1032.88}
06/08/2024 23:58:03 - INFO - llamafactory.extras.callbacks - {'loss': 3.5390, 'learning_rate': 4.9966e-05, 'epoch': 0.58, 'throughput': 1040.84}
06/08/2024 23:59:56 - INFO - llamafactory.extras.callbacks - {'loss': 3.4956, 'learning_rate': 4.9951e-05, 'epoch': 0.70, 'throughput': 1040.12}
06/09/2024 00:02:30 - INFO - llamafactory.extras.callbacks - {'loss': 3.5044, 'learning_rate': 4.9933e-05, 'epoch': 0.81, 'throughput': 1024.54}
06/09/2024 00:04:46 - INFO - llamafactory.extras.callbacks - {'loss': 3.4324, 'learning_rate': 4.9913e-05, 'epoch': 0.93, 'throughput': 1022.18}
06/09/2024 00:07:00 - INFO - llamafactory.extras.callbacks - {'loss': 3.2542, 'learning_rate': 4.9890e-05, 'epoch': 1.04, 'throughput': 1021.10}
06/09/2024 00:08:58 - INFO - llamafactory.extras.callbacks - {'loss': 2.9024, 'learning_rate': 4.9864e-05, 'epoch': 1.16, 'throughput': 1023.49}
06/09/2024 00:11:02 - INFO - llamafactory.extras.callbacks - {'loss': 2.8069, 'learning_rate': 4.9835e-05, 'epoch': 1.28, 'throughput': 1022.63}
06/09/2024 00:12:57 - INFO - llamafactory.extras.callbacks - {'loss': 2.7500, 'learning_rate': 4.9804e-05, 'epoch': 1.39, 'throughput': 1020.77}
06/09/2024 00:14:40 - INFO - llamafactory.extras.callbacks - {'loss': 2.7329, 'learning_rate': 4.9770e-05, 'epoch': 1.51, 'throughput': 1024.15}
06/09/2024 00:17:00 - INFO - llamafactory.extras.callbacks - {'loss': 2.8212, 'learning_rate': 4.9734e-05, 'epoch': 1.62, 'throughput': 1016.55}
06/09/2024 00:19:05 - INFO - llamafactory.extras.callbacks - {'loss': 2.7935, 'learning_rate': 4.9694e-05, 'epoch': 1.74, 'throughput': 1016.39}
06/09/2024 00:21:08 - INFO - llamafactory.extras.callbacks - {'loss': 2.7305, 'learning_rate': 4.9652e-05, 'epoch': 1.86, 'throughput': 1017.05}
06/09/2024 00:23:25 - INFO - llamafactory.extras.callbacks - {'loss': 2.6482, 'learning_rate': 4.9608e-05, 'epoch': 1.97, 'throughput': 1010.18}
06/09/2024 00:25:46 - INFO - llamafactory.extras.callbacks - {'loss': 2.4292, 'learning_rate': 4.9560e-05, 'epoch': 2.09, 'throughput': 1007.97}
06/09/2024 00:28:12 - INFO - llamafactory.extras.callbacks - {'loss': 2.1416, 'learning_rate': 4.9510e-05, 'epoch': 2.20, 'throughput': 1005.24}
06/09/2024 00:30:30 - INFO - llamafactory.extras.callbacks - {'loss': 2.2847, 'learning_rate': 4.9457e-05, 'epoch': 2.32, 'throughput': 1003.68}
06/09/2024 00:30:30 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-100
06/09/2024 00:30:30 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-100\config.json
06/09/2024 00:30:30 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-100\generation_config.json
06/09/2024 00:30:42 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-100\model.safetensors
06/09/2024 00:30:42 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-100\tokenizer_config.json
06/09/2024 00:30:42 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-100\special_tokens_map.json
06/09/2024 00:33:05 - INFO - llamafactory.extras.callbacks - {'loss': 2.0944, 'learning_rate': 4.9402e-05, 'epoch': 2.43, 'throughput': 990.48}
06/09/2024 00:35:29 - INFO - llamafactory.extras.callbacks - {'loss': 2.2862, 'learning_rate': 4.9344e-05, 'epoch': 2.55, 'throughput': 988.89}
06/09/2024 00:37:04 - INFO - llamafactory.extras.callbacks - {'loss': 2.1050, 'learning_rate': 4.9283e-05, 'epoch': 2.67, 'throughput': 992.54}
06/09/2024 00:38:56 - INFO - llamafactory.extras.callbacks - {'loss': 2.0287, 'learning_rate': 4.9220e-05, 'epoch': 2.78, 'throughput': 992.56}
06/09/2024 00:40:32 - INFO - llamafactory.extras.callbacks - {'loss': 2.0162, 'learning_rate': 4.9154e-05, 'epoch': 2.90, 'throughput': 995.97}
06/09/2024 00:42:39 - INFO - llamafactory.extras.callbacks - {'loss': 2.2464, 'learning_rate': 4.9085e-05, 'epoch': 3.01, 'throughput': 997.54}
06/09/2024 00:44:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.6561, 'learning_rate': 4.9014e-05, 'epoch': 3.13, 'throughput': 999.52}
06/09/2024 00:46:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.6113, 'learning_rate': 4.8940e-05, 'epoch': 3.25, 'throughput': 1000.08}
06/09/2024 00:48:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.6235, 'learning_rate': 4.8864e-05, 'epoch': 3.36, 'throughput': 998.34}
06/09/2024 00:51:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.7873, 'learning_rate': 4.8784e-05, 'epoch': 3.48, 'throughput': 997.72}
06/09/2024 00:53:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.3723, 'learning_rate': 4.8703e-05, 'epoch': 3.59, 'throughput': 995.58}
06/09/2024 00:55:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.6512, 'learning_rate': 4.8619e-05, 'epoch': 3.71, 'throughput': 997.10}
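The run parameters reported at startup (345 examples, per-device batch size 1, gradient accumulation 8, 35 epochs) account exactly for the "Total optimization steps = 1,505" line. A minimal sketch of the arithmetic, assuming the trainer floors the number of optimizer updates per epoch (which is what the logged numbers imply: 345 // 8 = 43 updates per epoch):

```python
# Reproduce the step arithmetic from the "Running training" summary above.
# Assumption (inferred from the log, not stated in it): incomplete
# gradient-accumulation windows at the end of an epoch are dropped,
# so updates per epoch = floor(dataloader batches / accumulation steps).

num_examples = 345
per_device_batch = 1
grad_accum = 8
epochs = 35

effective_batch = per_device_batch * grad_accum      # 8, as logged
batches_per_epoch = num_examples // per_device_batch # 345 dataloader batches
updates_per_epoch = batches_per_epoch // grad_accum  # 43 optimizer steps
total_steps = updates_per_epoch * epochs
print(total_steps)  # 1505, matching "Total optimization steps"
```

Note that 35 epochs over 345 examples is an unusually aggressive schedule for a full fine-tune; the loss trajectory below (3.81 down to well under 0.1) is consistent with the model memorizing the small dataset.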
06/09/2024 00:57:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.5524, 'learning_rate': 4.8532e-05, 'epoch': 3.83, 'throughput': 997.64}
06/09/2024 00:59:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.5954, 'learning_rate': 4.8442e-05, 'epoch': 3.94, 'throughput': 999.10}
06/09/2024 01:00:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.1652, 'learning_rate': 4.8350e-05, 'epoch': 4.06, 'throughput': 1001.11}
06/09/2024 01:03:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.1995, 'learning_rate': 4.8256e-05, 'epoch': 4.17, 'throughput': 1000.51}
06/09/2024 01:05:06 - INFO - llamafactory.extras.callbacks - {'loss': 0.9431, 'learning_rate': 4.8159e-05, 'epoch': 4.29, 'throughput': 999.58}
06/09/2024 01:07:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.1136, 'learning_rate': 4.8059e-05, 'epoch': 4.41, 'throughput': 998.32}
06/09/2024 01:09:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.1509, 'learning_rate': 4.7957e-05, 'epoch': 4.52, 'throughput': 999.32}
06/09/2024 01:10:56 - INFO - llamafactory.extras.callbacks - {'loss': 0.9962, 'learning_rate': 4.7853e-05, 'epoch': 4.64, 'throughput': 1000.59}
06/09/2024 01:10:56 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-200
06/09/2024 01:10:56 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-200\config.json
06/09/2024 01:10:56 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-200\generation_config.json
06/09/2024 01:11:07 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-200\model.safetensors
06/09/2024 01:11:07 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-200\tokenizer_config.json
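The callback entries above all share one shape, so the loss curve can be recovered from the raw log with a small parser. A minimal sketch; the regex targets exactly the {'loss': ..., 'learning_rate': ..., 'epoch': ...} dicts seen in this log and assumes no other fields precede them:

```python
# Extract (loss, learning_rate, epoch) tuples from llamafactory callback
# lines, e.g. to plot the training curve. The sample line is copied from
# the log above; in practice you would read the whole log file instead.
import re

sample = (
    "06/09/2024 00:57:20 - INFO - llamafactory.extras.callbacks - "
    "{'loss': 1.5524, 'learning_rate': 4.8532e-05, 'epoch': 3.83, "
    "'throughput': 997.64}"
)

pattern = re.compile(
    r"\{'loss': ([\d.]+), 'learning_rate': ([\d.e-]+), 'epoch': ([\d.]+)"
)

records = [(float(loss), float(lr), float(epoch))
           for loss, lr, epoch in pattern.findall(sample)]
print(records)  # [(1.5524, 4.8532e-05, 3.83)]
```

Running this over the full log would show the monotone learning-rate decay (cosine-like, from 5e-5) alongside the loss dropping from 3.81 toward zero.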
06/09/2024 01:11:07 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-200\special_tokens_map.json
06/09/2024 01:13:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2377, 'learning_rate': 4.7746e-05, 'epoch': 4.75, 'throughput': 996.13}
06/09/2024 01:15:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2340, 'learning_rate': 4.7636e-05, 'epoch': 4.87, 'throughput': 996.52}
06/09/2024 01:17:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.1453, 'learning_rate': 4.7524e-05, 'epoch': 4.99, 'throughput': 998.06}
06/09/2024 01:20:08 - INFO - llamafactory.extras.callbacks - {'loss': 0.9056, 'learning_rate': 4.7410e-05, 'epoch': 5.10, 'throughput': 997.31}
06/09/2024 01:21:58 - INFO - llamafactory.extras.callbacks - {'loss': 0.7349, 'learning_rate': 4.7293e-05, 'epoch': 5.22, 'throughput': 998.01}
06/09/2024 01:23:54 - INFO - llamafactory.extras.callbacks - {'loss': 0.6968, 'learning_rate': 4.7174e-05, 'epoch': 5.33, 'throughput': 999.00}
06/09/2024 01:26:19 - INFO - llamafactory.extras.callbacks - {'loss': 0.8510, 'learning_rate': 4.7052e-05, 'epoch': 5.45, 'throughput': 998.45}
06/09/2024 01:28:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.7160, 'learning_rate': 4.6928e-05, 'epoch': 5.57, 'throughput': 999.21}
06/09/2024 01:30:03 - INFO - llamafactory.extras.callbacks - {'loss': 0.6384, 'learning_rate': 4.6801e-05, 'epoch': 5.68, 'throughput': 1000.40}
06/09/2024 01:32:24 - INFO - llamafactory.extras.callbacks - {'loss': 0.7799, 'learning_rate': 4.6672e-05, 'epoch': 5.80, 'throughput': 998.13}
06/09/2024 01:34:26 - INFO - llamafactory.extras.callbacks - {'loss': 0.7923, 'learning_rate': 4.6541e-05, 'epoch': 5.91, 'throughput': 997.95}
06/09/2024 01:36:29 - INFO - llamafactory.extras.callbacks - {'loss': 0.7452, 'learning_rate': 4.6408e-05, 'epoch': 6.03, 'throughput': 998.69}
06/09/2024 01:38:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.4085, 'learning_rate': 4.6272e-05, 'epoch': 6.14, 'throughput': 999.98}
06/09/2024 01:40:29 - INFO - llamafactory.extras.callbacks - {'loss': 0.5777, 'learning_rate': 4.6133e-05, 'epoch': 6.26, 'throughput': 1000.40}
06/09/2024 01:42:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.4718, 'learning_rate': 4.5993e-05, 'epoch': 6.38, 'throughput': 1000.87}
06/09/2024 01:44:17 - INFO - llamafactory.extras.callbacks - {'loss': 0.3723, 'learning_rate': 4.5850e-05, 'epoch': 6.49, 'throughput': 1002.09}
06/09/2024 01:46:27 - INFO - llamafactory.extras.callbacks - {'loss': 0.4453, 'learning_rate': 4.5705e-05, 'epoch': 6.61, 'throughput': 1000.84}
06/09/2024 01:48:29 - INFO - llamafactory.extras.callbacks - {'loss': 0.5742, 'learning_rate': 4.5557e-05, 'epoch': 6.72, 'throughput': 1001.38}
06/09/2024 01:50:33 - INFO - llamafactory.extras.callbacks - {'loss': 0.5291, 'learning_rate': 4.5408e-05, 'epoch': 6.84, 'throughput': 1001.95}
06/09/2024 01:52:27 - INFO - llamafactory.extras.callbacks - {'loss': 0.4968, 'learning_rate': 4.5256e-05, 'epoch': 6.96, 'throughput': 1002.60}
06/09/2024 01:52:27 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-300
06/09/2024 01:52:27 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-300\config.json
06/09/2024 01:52:27 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-300\generation_config.json
06/09/2024 01:52:42 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-300\model.safetensors
06/09/2024 01:52:42 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-300\tokenizer_config.json
06/09/2024 01:52:42 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-300\special_tokens_map.json
06/09/2024 01:54:56 - INFO - llamafactory.extras.callbacks - {'loss': 0.4210, 'learning_rate': 4.5102e-05, 'epoch': 7.07, 'throughput': 999.94}
06/09/2024 01:57:04 - INFO - llamafactory.extras.callbacks - {'loss': 0.2767, 'learning_rate': 4.4946e-05, 'epoch': 7.19, 'throughput': 998.07}
06/09/2024 01:59:02 - INFO - llamafactory.extras.callbacks - {'loss': 0.2922, 'learning_rate': 4.4787e-05, 'epoch': 7.30, 'throughput': 998.79}
06/09/2024 02:00:54 - INFO - llamafactory.extras.callbacks - {'loss': 0.2768, 'learning_rate': 4.4627e-05, 'epoch': 7.42, 'throughput': 998.57}
06/09/2024 02:03:27 - INFO - llamafactory.extras.callbacks - {'loss': 0.3281, 'learning_rate': 4.4464e-05, 'epoch': 7.54, 'throughput': 995.98}
06/09/2024 02:05:38 - INFO - llamafactory.extras.callbacks - {'loss': 0.3374, 'learning_rate': 4.4299e-05, 'epoch': 7.65, 'throughput': 995.98}
06/09/2024 02:07:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.2491, 'learning_rate': 4.4132e-05, 'epoch': 7.77, 'throughput': 996.31}
06/09/2024 02:09:47 - INFO - llamafactory.extras.callbacks - {'loss': 0.3893, 'learning_rate': 4.3963e-05, 'epoch': 7.88, 'throughput': 996.21}
06/09/2024 02:12:03 - INFO - llamafactory.extras.callbacks - {'loss': 0.3761, 'learning_rate': 4.3792e-05, 'epoch': 8.00, 'throughput': 996.14}
06/09/2024 02:14:16 - INFO - llamafactory.extras.callbacks - {'loss': 0.2360, 'learning_rate': 4.3619e-05, 'epoch': 8.12, 'throughput': 996.45}
06/09/2024 02:16:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.1584, 'learning_rate': 4.3444e-05, 'epoch': 8.23, 'throughput': 996.43}
06/09/2024 02:17:54 - INFO - llamafactory.extras.callbacks - {'loss': 0.1481, 'learning_rate': 4.3267e-05, 'epoch': 8.35, 'throughput': 997.27}
06/09/2024 02:19:57 - INFO - llamafactory.extras.callbacks - {'loss': 0.1743, 'learning_rate': 4.3088e-05, 'epoch': 8.46, 'throughput': 996.84}
06/09/2024 02:22:22 - INFO - llamafactory.extras.callbacks - {'loss': 0.3162, 'learning_rate': 4.2907e-05, 'epoch': 8.58, 'throughput': 996.95}
06/09/2024 02:24:19 - INFO - llamafactory.extras.callbacks - {'loss': 0.2235, 'learning_rate': 4.2724e-05, 'epoch': 8.70, 'throughput': 997.40}
06/09/2024 02:26:35 - INFO - llamafactory.extras.callbacks - {'loss': 0.2504, 'learning_rate': 4.2539e-05, 'epoch': 8.81, 'throughput': 997.16}
06/09/2024 02:28:41 - INFO - llamafactory.extras.callbacks - {'loss': 0.2136, 'learning_rate': 4.2352e-05, 'epoch': 8.93, 'throughput': 997.85}
06/09/2024 02:30:12 - INFO - llamafactory.extras.callbacks - {'loss': 0.1047, 'learning_rate': 4.2163e-05, 'epoch': 9.04, 'throughput': 998.27}
06/09/2024 02:31:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.1015, 'learning_rate': 4.1972e-05, 'epoch': 9.16, 'throughput': 999.00}
06/09/2024 02:34:04 - INFO - llamafactory.extras.callbacks - {'loss': 0.1505, 'learning_rate': 4.1780e-05, 'epoch': 9.28, 'throughput': 998.92}
06/09/2024 02:34:04 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-400
06/09/2024 02:34:04 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-400\config.json
06/09/2024 02:34:04 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-400\generation_config.json
06/09/2024 02:34:12 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-400\model.safetensors
06/09/2024 02:34:12 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-400\tokenizer_config.json
06/09/2024 02:34:12 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-400\special_tokens_map.json
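The cadence of the callback entries also gives a rough wall-clock estimate for the whole run. The epoch counter advances by about 0.116 per entry, i.e. 5 of the 43 optimizer steps per epoch, so each entry appears to cover 5 steps; entries arrive roughly every two minutes. A back-of-the-envelope sketch (the 120 s interval is a midpoint assumption, not a logged value):

```python
# Rough ETA for the run from the logging cadence observed above.
# Assumptions: one callback entry per 5 optimizer steps (epoch advances
# by ~5/43 = 0.116 per entry) and ~120 s between entries.
total_steps = 1505
secs_per_step = 120 / 5           # ~24 s per optimizer step
eta_hours = total_steps * secs_per_step / 3600
print(round(eta_hours, 1))  # 10.0
```

That estimate is consistent with the timestamps: the run starts at 23:48 and is still going strong past 05:50 the next morning.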
06/09/2024 02:36:29 - INFO - llamafactory.extras.callbacks - {'loss': 0.1135, 'learning_rate': 4.1586e-05, 'epoch': 9.39, 'throughput': 997.11} 06/09/2024 02:38:38 - INFO - llamafactory.extras.callbacks - {'loss': 0.1832, 'learning_rate': 4.1389e-05, 'epoch': 9.51, 'throughput': 997.17} 06/09/2024 02:40:19 - INFO - llamafactory.extras.callbacks - {'loss': 0.1178, 'learning_rate': 4.1192e-05, 'epoch': 9.62, 'throughput': 998.04} 06/09/2024 02:42:54 - INFO - llamafactory.extras.callbacks - {'loss': 0.1426, 'learning_rate': 4.0992e-05, 'epoch': 9.74, 'throughput': 996.34} 06/09/2024 02:45:09 - INFO - llamafactory.extras.callbacks - {'loss': 0.1603, 'learning_rate': 4.0790e-05, 'epoch': 9.86, 'throughput': 996.80} 06/09/2024 02:47:19 - INFO - llamafactory.extras.callbacks - {'loss': 0.1548, 'learning_rate': 4.0587e-05, 'epoch': 9.97, 'throughput': 996.01} 06/09/2024 02:49:42 - INFO - llamafactory.extras.callbacks - {'loss': 0.1407, 'learning_rate': 4.0382e-05, 'epoch': 10.09, 'throughput': 995.97} 06/09/2024 02:51:53 - INFO - llamafactory.extras.callbacks - {'loss': 0.0791, 'learning_rate': 4.0176e-05, 'epoch': 10.20, 'throughput': 995.82} 06/09/2024 02:54:04 - INFO - llamafactory.extras.callbacks - {'loss': 0.0722, 'learning_rate': 3.9968e-05, 'epoch': 10.32, 'throughput': 995.11} 06/09/2024 02:55:47 - INFO - llamafactory.extras.callbacks - {'loss': 0.0655, 'learning_rate': 3.9758e-05, 'epoch': 10.43, 'throughput': 995.82} 06/09/2024 02:57:40 - INFO - llamafactory.extras.callbacks - {'loss': 0.0723, 'learning_rate': 3.9546e-05, 'epoch': 10.55, 'throughput': 996.49} 06/09/2024 02:59:24 - INFO - llamafactory.extras.callbacks - {'loss': 0.0655, 'learning_rate': 3.9333e-05, 'epoch': 10.67, 'throughput': 997.33} 06/09/2024 03:01:52 - INFO - llamafactory.extras.callbacks - {'loss': 0.1534, 'learning_rate': 3.9119e-05, 'epoch': 10.78, 'throughput': 996.60} 06/09/2024 03:03:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.0947, 'learning_rate': 3.8903e-05, 'epoch': 
10.90, 'throughput': 996.89} 06/09/2024 03:05:59 - INFO - llamafactory.extras.callbacks - {'loss': 0.0868, 'learning_rate': 3.8685e-05, 'epoch': 11.01, 'throughput': 997.29} 06/09/2024 03:08:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.0467, 'learning_rate': 3.8466e-05, 'epoch': 11.13, 'throughput': 997.52} 06/09/2024 03:09:48 - INFO - llamafactory.extras.callbacks - {'loss': 0.0521, 'learning_rate': 3.8246e-05, 'epoch': 11.25, 'throughput': 998.15} 06/09/2024 03:11:59 - INFO - llamafactory.extras.callbacks - {'loss': 0.0510, 'learning_rate': 3.8024e-05, 'epoch': 11.36, 'throughput': 998.66} 06/09/2024 03:13:59 - INFO - llamafactory.extras.callbacks - {'loss': 0.0753, 'learning_rate': 3.7800e-05, 'epoch': 11.48, 'throughput': 999.10} 06/09/2024 03:16:02 - INFO - llamafactory.extras.callbacks - {'loss': 0.0601, 'learning_rate': 3.7575e-05, 'epoch': 11.59, 'throughput': 999.57} 06/09/2024 03:16:02 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-500 06/09/2024 03:16:02 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-500\config.json 06/09/2024 03:16:02 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-500\generation_config.json 06/09/2024 03:16:09 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-500\model.safetensors 06/09/2024 03:16:09 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-500\tokenizer_config.json 06/09/2024 03:16:09 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-500\special_tokens_map.json 06/09/2024 03:18:09 - INFO - llamafactory.extras.callbacks - {'loss': 0.0533, 
'learning_rate': 3.7349e-05, 'epoch': 11.71, 'throughput': 998.61} 06/09/2024 03:20:11 - INFO - llamafactory.extras.callbacks - {'loss': 0.0640, 'learning_rate': 3.7122e-05, 'epoch': 11.83, 'throughput': 999.10} 06/09/2024 03:22:03 - INFO - llamafactory.extras.callbacks - {'loss': 0.0755, 'learning_rate': 3.6893e-05, 'epoch': 11.94, 'throughput': 999.59} 06/09/2024 03:23:54 - INFO - llamafactory.extras.callbacks - {'loss': 0.0553, 'learning_rate': 3.6662e-05, 'epoch': 12.06, 'throughput': 1000.07} 06/09/2024 03:26:08 - INFO - llamafactory.extras.callbacks - {'loss': 0.0355, 'learning_rate': 3.6431e-05, 'epoch': 12.17, 'throughput': 999.57} 06/09/2024 03:28:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.0413, 'learning_rate': 3.6198e-05, 'epoch': 12.29, 'throughput': 999.88} 06/09/2024 03:30:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.0596, 'learning_rate': 3.5964e-05, 'epoch': 12.41, 'throughput': 1000.36} 06/09/2024 03:32:06 - INFO - llamafactory.extras.callbacks - {'loss': 0.0479, 'learning_rate': 3.5729e-05, 'epoch': 12.52, 'throughput': 1000.70} 06/09/2024 03:34:19 - INFO - llamafactory.extras.callbacks - {'loss': 0.0470, 'learning_rate': 3.5493e-05, 'epoch': 12.64, 'throughput': 1000.73} 06/09/2024 03:36:18 - INFO - llamafactory.extras.callbacks - {'loss': 0.0492, 'learning_rate': 3.5256e-05, 'epoch': 12.75, 'throughput': 1001.04} 06/09/2024 03:38:16 - INFO - llamafactory.extras.callbacks - {'loss': 0.0471, 'learning_rate': 3.5017e-05, 'epoch': 12.87, 'throughput': 1001.36} 06/09/2024 03:40:25 - INFO - llamafactory.extras.callbacks - {'loss': 0.1041, 'learning_rate': 3.4778e-05, 'epoch': 12.99, 'throughput': 1000.82} 06/09/2024 03:42:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.0282, 'learning_rate': 3.4537e-05, 'epoch': 13.10, 'throughput': 1001.50} 06/09/2024 03:44:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.0272, 'learning_rate': 3.4295e-05, 'epoch': 13.22, 'throughput': 1000.51} 06/09/2024 03:47:00 - INFO - 
llamafactory.extras.callbacks - {'loss': 0.0342, 'learning_rate': 3.4053e-05, 'epoch': 13.33, 'throughput': 1000.60} 06/09/2024 03:48:31 - INFO - llamafactory.extras.callbacks - {'loss': 0.0855, 'learning_rate': 3.3809e-05, 'epoch': 13.45, 'throughput': 1000.75} 06/09/2024 03:50:33 - INFO - llamafactory.extras.callbacks - {'loss': 0.0378, 'learning_rate': 3.3564e-05, 'epoch': 13.57, 'throughput': 1001.03} 06/09/2024 03:52:10 - INFO - llamafactory.extras.callbacks - {'loss': 0.0457, 'learning_rate': 3.3319e-05, 'epoch': 13.68, 'throughput': 1001.65} 06/09/2024 03:54:42 - INFO - llamafactory.extras.callbacks - {'loss': 0.0453, 'learning_rate': 3.3072e-05, 'epoch': 13.80, 'throughput': 1001.53} 06/09/2024 03:56:41 - INFO - llamafactory.extras.callbacks - {'loss': 0.0378, 'learning_rate': 3.2825e-05, 'epoch': 13.91, 'throughput': 1001.72} 06/09/2024 03:56:41 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-600 06/09/2024 03:56:41 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-600\config.json 06/09/2024 03:56:41 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-600\generation_config.json 06/09/2024 03:57:01 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-600\model.safetensors 06/09/2024 03:57:01 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-600\tokenizer_config.json 06/09/2024 03:57:01 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-600\special_tokens_map.json 06/09/2024 03:59:09 - INFO - llamafactory.extras.callbacks - {'loss': 0.0296, 'learning_rate': 3.2576e-05, 'epoch': 14.03, 
'throughput': 999.94} 06/09/2024 04:01:17 - INFO - llamafactory.extras.callbacks - {'loss': 0.0216, 'learning_rate': 3.2327e-05, 'epoch': 14.14, 'throughput': 999.25} 06/09/2024 04:03:35 - INFO - llamafactory.extras.callbacks - {'loss': 0.0299, 'learning_rate': 3.2077e-05, 'epoch': 14.26, 'throughput': 998.89} 06/09/2024 04:05:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.0247, 'learning_rate': 3.1827e-05, 'epoch': 14.38, 'throughput': 998.88} 06/09/2024 04:08:01 - INFO - llamafactory.extras.callbacks - {'loss': 0.0455, 'learning_rate': 3.1575e-05, 'epoch': 14.49, 'throughput': 997.37} 06/09/2024 04:09:49 - INFO - llamafactory.extras.callbacks - {'loss': 0.0262, 'learning_rate': 3.1323e-05, 'epoch': 14.61, 'throughput': 997.84} 06/09/2024 04:11:59 - INFO - llamafactory.extras.callbacks - {'loss': 0.0281, 'learning_rate': 3.1071e-05, 'epoch': 14.72, 'throughput': 998.19} 06/09/2024 04:13:52 - INFO - llamafactory.extras.callbacks - {'loss': 0.0294, 'learning_rate': 3.0817e-05, 'epoch': 14.84, 'throughput': 998.63} 06/09/2024 04:15:58 - INFO - llamafactory.extras.callbacks - {'loss': 0.0291, 'learning_rate': 3.0563e-05, 'epoch': 14.96, 'throughput': 998.97} 06/09/2024 04:18:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.0216, 'learning_rate': 3.0308e-05, 'epoch': 15.07, 'throughput': 999.01} 06/09/2024 04:20:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.0218, 'learning_rate': 3.0053e-05, 'epoch': 15.19, 'throughput': 997.79} 06/09/2024 04:22:51 - INFO - llamafactory.extras.callbacks - {'loss': 0.0283, 'learning_rate': 2.9797e-05, 'epoch': 15.30, 'throughput': 997.64} 06/09/2024 04:24:40 - INFO - llamafactory.extras.callbacks - {'loss': 0.0259, 'learning_rate': 2.9541e-05, 'epoch': 15.42, 'throughput': 997.98} 06/09/2024 04:26:42 - INFO - llamafactory.extras.callbacks - {'loss': 0.0716, 'learning_rate': 2.9284e-05, 'epoch': 15.54, 'throughput': 998.25} 06/09/2024 04:29:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.0250, 
'learning_rate': 2.9027e-05, 'epoch': 15.65, 'throughput': 998.26} 06/09/2024 04:31:00 - INFO - llamafactory.extras.callbacks - {'loss': 0.0218, 'learning_rate': 2.8769e-05, 'epoch': 15.77, 'throughput': 998.47} 06/09/2024 04:32:58 - INFO - llamafactory.extras.callbacks - {'loss': 0.0231, 'learning_rate': 2.8511e-05, 'epoch': 15.88, 'throughput': 998.80} 06/09/2024 04:34:35 - INFO - llamafactory.extras.callbacks - {'loss': 0.0228, 'learning_rate': 2.8252e-05, 'epoch': 16.00, 'throughput': 999.30} 06/09/2024 04:36:56 - INFO - llamafactory.extras.callbacks - {'loss': 0.0290, 'learning_rate': 2.7993e-05, 'epoch': 16.12, 'throughput': 999.19} 06/09/2024 04:39:18 - INFO - llamafactory.extras.callbacks - {'loss': 0.0199, 'learning_rate': 2.7734e-05, 'epoch': 16.23, 'throughput': 999.18} 06/09/2024 04:39:18 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-700 06/09/2024 04:39:18 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-700\config.json 06/09/2024 04:39:18 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-700\generation_config.json 06/09/2024 04:39:31 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-700\model.safetensors 06/09/2024 04:39:31 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-700\tokenizer_config.json 06/09/2024 04:39:31 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-700\special_tokens_map.json 06/09/2024 04:41:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.0259, 'learning_rate': 2.7475e-05, 'epoch': 16.35, 'throughput': 997.81} 06/09/2024 04:43:43 - INFO - 
llamafactory.extras.callbacks - {'loss': 0.0210, 'learning_rate': 2.7215e-05, 'epoch': 16.46, 'throughput': 997.83} 06/09/2024 04:45:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.0199, 'learning_rate': 2.6955e-05, 'epoch': 16.58, 'throughput': 998.23} 06/09/2024 04:47:35 - INFO - llamafactory.extras.callbacks - {'loss': 0.0189, 'learning_rate': 2.6695e-05, 'epoch': 16.70, 'throughput': 998.55} 06/09/2024 04:49:15 - INFO - llamafactory.extras.callbacks - {'loss': 0.0169, 'learning_rate': 2.6434e-05, 'epoch': 16.81, 'throughput': 998.93} 06/09/2024 04:51:09 - INFO - llamafactory.extras.callbacks - {'loss': 0.0209, 'learning_rate': 2.6174e-05, 'epoch': 16.93, 'throughput': 999.34} 06/09/2024 04:53:23 - INFO - llamafactory.extras.callbacks - {'loss': 0.0201, 'learning_rate': 2.5913e-05, 'epoch': 17.04, 'throughput': 999.59} 06/09/2024 04:55:31 - INFO - llamafactory.extras.callbacks - {'loss': 0.0170, 'learning_rate': 2.5652e-05, 'epoch': 17.16, 'throughput': 999.09} 06/09/2024 04:57:15 - INFO - llamafactory.extras.callbacks - {'loss': 0.0187, 'learning_rate': 2.5391e-05, 'epoch': 17.28, 'throughput': 999.36} 06/09/2024 04:59:04 - INFO - llamafactory.extras.callbacks - {'loss': 0.0242, 'learning_rate': 2.5130e-05, 'epoch': 17.39, 'throughput': 999.72} 06/09/2024 05:01:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.0153, 'learning_rate': 2.4870e-05, 'epoch': 17.51, 'throughput': 1000.04} 06/09/2024 05:03:20 - INFO - llamafactory.extras.callbacks - {'loss': 0.0182, 'learning_rate': 2.4609e-05, 'epoch': 17.62, 'throughput': 1000.12} 06/09/2024 05:05:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.0143, 'learning_rate': 2.4348e-05, 'epoch': 17.74, 'throughput': 1000.34} 06/09/2024 05:07:25 - INFO - llamafactory.extras.callbacks - {'loss': 0.0164, 'learning_rate': 2.4087e-05, 'epoch': 17.86, 'throughput': 1000.46} 06/09/2024 05:09:29 - INFO - llamafactory.extras.callbacks - {'loss': 0.0157, 'learning_rate': 2.3826e-05, 'epoch': 17.97, 'throughput': 
1000.43}
06/09/2024 05:11:39 - INFO - llamafactory.extras.callbacks - {'loss': 0.0304, 'learning_rate': 2.3566e-05, 'epoch': 18.09, 'throughput': 1000.34}
06/09/2024 05:14:00 - INFO - llamafactory.extras.callbacks - {'loss': 0.0090, 'learning_rate': 2.3305e-05, 'epoch': 18.20, 'throughput': 1000.45}
06/09/2024 05:15:24 - INFO - llamafactory.extras.callbacks - {'loss': 0.0105, 'learning_rate': 2.3045e-05, 'epoch': 18.32, 'throughput': 1000.89}
06/09/2024 05:17:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.0104, 'learning_rate': 2.2785e-05, 'epoch': 18.43, 'throughput': 1000.55}
06/09/2024 05:19:46 - INFO - llamafactory.extras.callbacks - {'loss': 0.0103, 'learning_rate': 2.2525e-05, 'epoch': 18.55, 'throughput': 1000.55}
06/09/2024 05:19:46 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-800
06/09/2024 05:19:46 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-800\config.json
06/09/2024 05:19:46 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-800\generation_config.json
06/09/2024 05:19:58 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-800\model.safetensors
06/09/2024 05:19:58 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-800\tokenizer_config.json
06/09/2024 05:19:58 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-800\special_tokens_map.json
06/09/2024 05:22:27 - INFO - llamafactory.extras.callbacks - {'loss': 0.0107, 'learning_rate': 2.2266e-05, 'epoch': 18.67, 'throughput': 999.47}
06/09/2024 05:24:25 - INFO - llamafactory.extras.callbacks - {'loss': 0.0269, 'learning_rate':
2.2007e-05, 'epoch': 18.78, 'throughput': 999.49}
06/09/2024 05:26:24 - INFO - llamafactory.extras.callbacks - {'loss': 0.0086, 'learning_rate': 2.1748e-05, 'epoch': 18.90, 'throughput': 999.70}
06/09/2024 05:28:13 - INFO - llamafactory.extras.callbacks - {'loss': 0.0133, 'learning_rate': 2.1489e-05, 'epoch': 19.01, 'throughput': 999.96}
06/09/2024 05:29:46 - INFO - llamafactory.extras.callbacks - {'loss': 0.0114, 'learning_rate': 2.1231e-05, 'epoch': 19.13, 'throughput': 1000.41}
06/09/2024 05:31:50 - INFO - llamafactory.extras.callbacks - {'loss': 0.0070, 'learning_rate': 2.0973e-05, 'epoch': 19.25, 'throughput': 1000.64}
06/09/2024 05:33:47 - INFO - llamafactory.extras.callbacks - {'loss': 0.0088, 'learning_rate': 2.0716e-05, 'epoch': 19.36, 'throughput': 1000.77}
06/09/2024 05:36:12 - INFO - llamafactory.extras.callbacks - {'loss': 0.0070, 'learning_rate': 2.0459e-05, 'epoch': 19.48, 'throughput': 1000.49}
06/09/2024 05:37:59 - INFO - llamafactory.extras.callbacks - {'loss': 0.0059, 'learning_rate': 2.0203e-05, 'epoch': 19.59, 'throughput': 1000.78}
06/09/2024 05:39:59 - INFO - llamafactory.extras.callbacks - {'loss': 0.0069, 'learning_rate': 1.9947e-05, 'epoch': 19.71, 'throughput': 1000.99}
06/09/2024 05:42:18 - INFO - llamafactory.extras.callbacks - {'loss': 0.0080, 'learning_rate': 1.9692e-05, 'epoch': 19.83, 'throughput': 1001.10}
06/09/2024 05:44:15 - INFO - llamafactory.extras.callbacks - {'loss': 0.0127, 'learning_rate': 1.9437e-05, 'epoch': 19.94, 'throughput': 1001.23}
06/09/2024 05:46:30 - INFO - llamafactory.extras.callbacks - {'loss': 0.0059, 'learning_rate': 1.9183e-05, 'epoch': 20.06, 'throughput': 1001.07}
06/09/2024 05:48:17 - INFO - llamafactory.extras.callbacks - {'loss': 0.0046, 'learning_rate': 1.8929e-05, 'epoch': 20.17, 'throughput': 1001.07}
06/09/2024 05:50:34 - INFO - llamafactory.extras.callbacks - {'loss': 0.0058, 'learning_rate': 1.8677e-05, 'epoch': 20.29, 'throughput': 1001.15}
06/09/2024 05:52:48 - INFO -
llamafactory.extras.callbacks - {'loss': 0.0082, 'learning_rate': 1.8425e-05, 'epoch': 20.41, 'throughput': 1000.86}
06/09/2024 05:54:46 - INFO - llamafactory.extras.callbacks - {'loss': 0.0148, 'learning_rate': 1.8173e-05, 'epoch': 20.52, 'throughput': 1001.09}
06/09/2024 05:56:27 - INFO - llamafactory.extras.callbacks - {'loss': 0.0064, 'learning_rate': 1.7923e-05, 'epoch': 20.64, 'throughput': 1001.43}
06/09/2024 05:58:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.0080, 'learning_rate': 1.7673e-05, 'epoch': 20.75, 'throughput': 1001.37}
06/09/2024 06:00:59 - INFO - llamafactory.extras.callbacks - {'loss': 0.0053, 'learning_rate': 1.7424e-05, 'epoch': 20.87, 'throughput': 1001.39}
06/09/2024 06:00:59 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-900
06/09/2024 06:00:59 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-900\config.json
06/09/2024 06:00:59 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-900\generation_config.json
06/09/2024 06:01:08 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-900\model.safetensors
06/09/2024 06:01:08 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-900\tokenizer_config.json
06/09/2024 06:01:08 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-900\special_tokens_map.json
06/09/2024 06:03:11 - INFO - llamafactory.extras.callbacks - {'loss': 0.0054, 'learning_rate': 1.7175e-05, 'epoch': 20.99, 'throughput': 1000.86}
06/09/2024 06:05:26 - INFO - llamafactory.extras.callbacks - {'loss': 0.0044, 'learning_rate': 1.6928e-05, 'epoch': 21.10,
'throughput': 1000.89}
06/09/2024 06:07:35 - INFO - llamafactory.extras.callbacks - {'loss': 0.0043, 'learning_rate': 1.6681e-05, 'epoch': 21.22, 'throughput': 1000.65}
06/09/2024 06:09:39 - INFO - llamafactory.extras.callbacks - {'loss': 0.0035, 'learning_rate': 1.6436e-05, 'epoch': 21.33, 'throughput': 1000.81}
06/09/2024 06:11:17 - INFO - llamafactory.extras.callbacks - {'loss': 0.0044, 'learning_rate': 1.6191e-05, 'epoch': 21.45, 'throughput': 1001.21}
06/09/2024 06:13:17 - INFO - llamafactory.extras.callbacks - {'loss': 0.0178, 'learning_rate': 1.5947e-05, 'epoch': 21.57, 'throughput': 1001.45}
06/09/2024 06:14:58 - INFO - llamafactory.extras.callbacks - {'loss': 0.0089, 'learning_rate': 1.5705e-05, 'epoch': 21.68, 'throughput': 1001.76}
06/09/2024 06:17:41 - INFO - llamafactory.extras.callbacks - {'loss': 0.0040, 'learning_rate': 1.5463e-05, 'epoch': 21.80, 'throughput': 1001.09}
06/09/2024 06:19:50 - INFO - llamafactory.extras.callbacks - {'loss': 0.0040, 'learning_rate': 1.5222e-05, 'epoch': 21.91, 'throughput': 1000.93}
06/09/2024 06:21:39 - INFO - llamafactory.extras.callbacks - {'loss': 0.0034, 'learning_rate': 1.4983e-05, 'epoch': 22.03, 'throughput': 1001.01}
06/09/2024 06:23:20 - INFO - llamafactory.extras.callbacks - {'loss': 0.0066, 'learning_rate': 1.4744e-05, 'epoch': 22.14, 'throughput': 1001.40}
06/09/2024 06:25:22 - INFO - llamafactory.extras.callbacks - {'loss': 0.0034, 'learning_rate': 1.4507e-05, 'epoch': 22.26, 'throughput': 1001.54}
06/09/2024 06:27:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.0032, 'learning_rate': 1.4271e-05, 'epoch': 22.38, 'throughput': 1001.84}
06/09/2024 06:29:01 - INFO - llamafactory.extras.callbacks - {'loss': 0.0124, 'learning_rate': 1.4036e-05, 'epoch': 22.49, 'throughput': 1002.12}
06/09/2024 06:31:31 - INFO - llamafactory.extras.callbacks - {'loss': 0.0027, 'learning_rate': 1.3802e-05, 'epoch': 22.61, 'throughput': 1001.67}
06/09/2024 06:33:18 - INFO - llamafactory.extras.callbacks - {'loss': 0.0037,
'learning_rate': 1.3569e-05, 'epoch': 22.72, 'throughput': 1001.90}
06/09/2024 06:35:28 - INFO - llamafactory.extras.callbacks - {'loss': 0.0029, 'learning_rate': 1.3338e-05, 'epoch': 22.84, 'throughput': 1001.88}
06/09/2024 06:38:19 - INFO - llamafactory.extras.callbacks - {'loss': 0.0035, 'learning_rate': 1.3107e-05, 'epoch': 22.96, 'throughput': 1001.00}
06/09/2024 06:39:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.0020, 'learning_rate': 1.2878e-05, 'epoch': 23.07, 'throughput': 1001.32}
06/09/2024 06:41:50 - INFO - llamafactory.extras.callbacks - {'loss': 0.0023, 'learning_rate': 1.2651e-05, 'epoch': 23.19, 'throughput': 1001.57}
06/09/2024 06:41:50 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1000
06/09/2024 06:41:50 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1000\config.json
06/09/2024 06:41:50 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1000\generation_config.json
06/09/2024 06:42:07 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1000\model.safetensors
06/09/2024 06:42:07 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1000\tokenizer_config.json
06/09/2024 06:42:07 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1000\special_tokens_map.json
06/09/2024 06:44:12 - INFO - llamafactory.extras.callbacks - {'loss': 0.0031, 'learning_rate': 1.2425e-05, 'epoch': 23.30, 'throughput': 1000.67}
06/09/2024 06:46:33 - INFO - llamafactory.extras.callbacks - {'loss': 0.0027, 'learning_rate': 1.2200e-05, 'epoch': 23.42, 'throughput': 1000.50}
06/09/2024 06:48:36 - INFO -
llamafactory.extras.callbacks - {'loss': 0.0124, 'learning_rate': 1.1976e-05, 'epoch': 23.54, 'throughput': 1000.70}
06/09/2024 06:50:18 - INFO - llamafactory.extras.callbacks - {'loss': 0.0023, 'learning_rate': 1.1754e-05, 'epoch': 23.65, 'throughput': 1001.02}
06/09/2024 06:52:50 - INFO - llamafactory.extras.callbacks - {'loss': 0.0027, 'learning_rate': 1.1534e-05, 'epoch': 23.77, 'throughput': 1000.96}
06/09/2024 06:54:48 - INFO - llamafactory.extras.callbacks - {'loss': 0.0041, 'learning_rate': 1.1315e-05, 'epoch': 23.88, 'throughput': 1001.18}
06/09/2024 06:56:41 - INFO - llamafactory.extras.callbacks - {'loss': 0.0021, 'learning_rate': 1.1097e-05, 'epoch': 24.00, 'throughput': 1001.39}
06/09/2024 06:58:29 - INFO - llamafactory.extras.callbacks - {'loss': 0.0039, 'learning_rate': 1.0881e-05, 'epoch': 24.12, 'throughput': 1001.66}
06/09/2024 07:00:50 - INFO - llamafactory.extras.callbacks - {'loss': 0.0025, 'learning_rate': 1.0667e-05, 'epoch': 24.23, 'throughput': 1001.60}
06/09/2024 07:02:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.0020, 'learning_rate': 1.0454e-05, 'epoch': 24.35, 'throughput': 1001.79}
06/09/2024 07:05:24 - INFO - llamafactory.extras.callbacks - {'loss': 0.0024, 'learning_rate': 1.0242e-05, 'epoch': 24.46, 'throughput': 1001.71}
06/09/2024 07:07:13 - INFO - llamafactory.extras.callbacks - {'loss': 0.0031, 'learning_rate': 1.0032e-05, 'epoch': 24.58, 'throughput': 1001.93}
06/09/2024 07:09:07 - INFO - llamafactory.extras.callbacks - {'loss': 0.0022, 'learning_rate': 9.8241e-06, 'epoch': 24.70, 'throughput': 1002.12}
06/09/2024 07:10:53 - INFO - llamafactory.extras.callbacks - {'loss': 0.0022, 'learning_rate': 9.6176e-06, 'epoch': 24.81, 'throughput': 1002.46}
06/09/2024 07:13:23 - INFO - llamafactory.extras.callbacks - {'loss': 0.0109, 'learning_rate': 9.4128e-06, 'epoch': 24.93, 'throughput': 1001.58}
06/09/2024 07:15:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 9.2096e-06, 'epoch': 25.04,
'throughput': 1001.78}
06/09/2024 07:16:58 - INFO - llamafactory.extras.callbacks - {'loss': 0.0016, 'learning_rate': 9.0082e-06, 'epoch': 25.16, 'throughput': 1002.01}
06/09/2024 07:18:57 - INFO - llamafactory.extras.callbacks - {'loss': 0.0020, 'learning_rate': 8.8085e-06, 'epoch': 25.28, 'throughput': 1002.15}
06/09/2024 07:20:46 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 8.6106e-06, 'epoch': 25.39, 'throughput': 1002.35}
06/09/2024 07:22:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.0028, 'learning_rate': 8.4144e-06, 'epoch': 25.51, 'throughput': 1002.44}
06/09/2024 07:22:55 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1100
06/09/2024 07:22:55 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1100\config.json
06/09/2024 07:22:55 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1100\generation_config.json
06/09/2024 07:23:26 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1100\model.safetensors
06/09/2024 07:23:26 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1100\tokenizer_config.json
06/09/2024 07:23:26 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1100\special_tokens_map.json
06/09/2024 07:27:26 - INFO - llamafactory.extras.callbacks - {'loss': 0.0021, 'learning_rate': 8.2201e-06, 'epoch': 25.62, 'throughput': 998.15}
06/09/2024 07:29:14 - INFO - llamafactory.extras.callbacks - {'loss': 0.0057, 'learning_rate': 8.0276e-06, 'epoch': 25.74, 'throughput': 998.39}
06/09/2024 07:31:09 - INFO - llamafactory.extras.callbacks - {'loss': 0.0067,
'learning_rate': 7.8369e-06, 'epoch': 25.86, 'throughput': 998.59}
06/09/2024 07:33:09 - INFO - llamafactory.extras.callbacks - {'loss': 0.0020, 'learning_rate': 7.6481e-06, 'epoch': 25.97, 'throughput': 998.67}
06/09/2024 07:35:20 - INFO - llamafactory.extras.callbacks - {'loss': 0.0021, 'learning_rate': 7.4612e-06, 'epoch': 26.09, 'throughput': 998.73}
06/09/2024 07:37:21 - INFO - llamafactory.extras.callbacks - {'loss': 0.0020, 'learning_rate': 7.2763e-06, 'epoch': 26.20, 'throughput': 998.92}
06/09/2024 07:39:24 - INFO - llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 7.0932e-06, 'epoch': 26.32, 'throughput': 998.88}
06/09/2024 07:41:36 - INFO - llamafactory.extras.callbacks - {'loss': 0.0014, 'learning_rate': 6.9121e-06, 'epoch': 26.43, 'throughput': 998.77}
06/09/2024 07:43:49 - INFO - llamafactory.extras.callbacks - {'loss': 0.0022, 'learning_rate': 6.7330e-06, 'epoch': 26.55, 'throughput': 998.82}
06/09/2024 07:45:54 - INFO - llamafactory.extras.callbacks - {'loss': 0.0023, 'learning_rate': 6.5558e-06, 'epoch': 26.67, 'throughput': 998.48}
06/09/2024 07:47:43 - INFO - llamafactory.extras.callbacks - {'loss': 0.0028, 'learning_rate': 6.3807e-06, 'epoch': 26.78, 'throughput': 998.66}
06/09/2024 07:49:50 - INFO - llamafactory.extras.callbacks - {'loss': 0.0060, 'learning_rate': 6.2076e-06, 'epoch': 26.90, 'throughput': 998.69}
06/09/2024 07:51:36 - INFO - llamafactory.extras.callbacks - {'loss': 0.0045, 'learning_rate': 6.0365e-06, 'epoch': 27.01, 'throughput': 998.96}
06/09/2024 07:53:40 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 5.8675e-06, 'epoch': 27.13, 'throughput': 999.15}
06/09/2024 07:55:56 - INFO - llamafactory.extras.callbacks - {'loss': 0.0020, 'learning_rate': 5.7006e-06, 'epoch': 27.25, 'throughput': 999.08}
06/09/2024 07:58:18 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 5.5358e-06, 'epoch': 27.36, 'throughput': 998.78}
06/09/2024 08:00:40 - INFO -
llamafactory.extras.callbacks - {'loss': 0.0060, 'learning_rate': 5.3731e-06, 'epoch': 27.48, 'throughput': 998.02}
06/09/2024 08:02:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.0043, 'learning_rate': 5.2126e-06, 'epoch': 27.59, 'throughput': 998.17}
06/09/2024 08:04:21 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 5.0542e-06, 'epoch': 27.71, 'throughput': 998.41}
06/09/2024 08:06:12 - INFO - llamafactory.extras.callbacks - {'loss': 0.0020, 'learning_rate': 4.8980e-06, 'epoch': 27.83, 'throughput': 998.66}
06/09/2024 08:06:12 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1200
06/09/2024 08:06:12 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1200\config.json
06/09/2024 08:06:12 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1200\generation_config.json
06/09/2024 08:06:15 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1200\model.safetensors
06/09/2024 08:06:15 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1200\tokenizer_config.json
06/09/2024 08:06:15 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1200\special_tokens_map.json
06/09/2024 08:08:33 - INFO - llamafactory.extras.callbacks - {'loss': 0.0022, 'learning_rate': 4.7439e-06, 'epoch': 27.94, 'throughput': 998.10}
06/09/2024 08:11:12 - INFO - llamafactory.extras.callbacks - {'loss': 0.0016, 'learning_rate': 4.5921e-06, 'epoch': 28.06, 'throughput': 997.51}
06/09/2024 08:13:00 - INFO - llamafactory.extras.callbacks - {'loss': 0.0033, 'learning_rate': 4.4425e-06, 'epoch': 28.17,
'throughput': 997.65}
06/09/2024 08:15:26 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 4.2952e-06, 'epoch': 28.29, 'throughput': 997.44}
06/09/2024 08:17:30 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 4.1501e-06, 'epoch': 28.41, 'throughput': 997.40}
06/09/2024 08:19:21 - INFO - llamafactory.extras.callbacks - {'loss': 0.0078, 'learning_rate': 4.0072e-06, 'epoch': 28.52, 'throughput': 997.56}
06/09/2024 08:21:37 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 3.8667e-06, 'epoch': 28.64, 'throughput': 997.29}
06/09/2024 08:24:01 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 3.7284e-06, 'epoch': 28.75, 'throughput': 996.51}
06/09/2024 08:25:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.0027, 'learning_rate': 3.5925e-06, 'epoch': 28.87, 'throughput': 996.81}
06/09/2024 08:27:46 - INFO - llamafactory.extras.callbacks - {'loss': 0.0023, 'learning_rate': 3.4589e-06, 'epoch': 28.99, 'throughput': 996.97}
06/09/2024 08:30:00 - INFO - llamafactory.extras.callbacks - {'loss': 0.0029, 'learning_rate': 3.3277e-06, 'epoch': 29.10, 'throughput': 997.00}
06/09/2024 08:31:58 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 3.1988e-06, 'epoch': 29.22, 'throughput': 997.14}
06/09/2024 08:34:08 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 3.0723e-06, 'epoch': 29.33, 'throughput': 997.27}
06/09/2024 08:36:06 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 2.9481e-06, 'epoch': 29.45, 'throughput': 997.43}
06/09/2024 08:38:28 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 2.8264e-06, 'epoch': 29.57, 'throughput': 997.05}
06/09/2024 08:40:32 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 2.7071e-06, 'epoch': 29.68, 'throughput': 997.23}
06/09/2024 08:42:30 - INFO - llamafactory.extras.callbacks - {'loss': 0.0023,
'learning_rate': 2.5902e-06, 'epoch': 29.80, 'throughput': 997.24}
06/09/2024 08:44:29 - INFO - llamafactory.extras.callbacks - {'loss': 0.0078, 'learning_rate': 2.4758e-06, 'epoch': 29.91, 'throughput': 997.37}
06/09/2024 08:46:27 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 2.3638e-06, 'epoch': 30.03, 'throughput': 997.36}
06/09/2024 08:48:26 - INFO - llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 2.2543e-06, 'epoch': 30.14, 'throughput': 997.56}
06/09/2024 08:48:26 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1300
06/09/2024 08:48:26 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1300\config.json
06/09/2024 08:48:26 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1300\generation_config.json
06/09/2024 08:48:35 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1300\model.safetensors
06/09/2024 08:48:35 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1300\tokenizer_config.json
06/09/2024 08:48:35 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1300\special_tokens_map.json
06/09/2024 08:50:43 - INFO - llamafactory.extras.callbacks - {'loss': 0.0071, 'learning_rate': 2.1472e-06, 'epoch': 30.26, 'throughput': 997.03}
06/09/2024 08:52:25 - INFO - llamafactory.extras.callbacks - {'loss': 0.0016, 'learning_rate': 2.0427e-06, 'epoch': 30.38, 'throughput': 997.31}
06/09/2024 08:54:28 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 1.9406e-06, 'epoch': 30.49, 'throughput': 997.40}
06/09/2024 08:56:57 - INFO -
llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 1.8411e-06, 'epoch': 30.61, 'throughput': 997.00}
06/09/2024 08:59:07 - INFO - llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 1.7441e-06, 'epoch': 30.72, 'throughput': 996.94}
06/09/2024 09:01:08 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 1.6496e-06, 'epoch': 30.84, 'throughput': 997.09}
06/09/2024 09:03:19 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 1.5577e-06, 'epoch': 30.96, 'throughput': 996.94}
06/09/2024 09:05:11 - INFO - llamafactory.extras.callbacks - {'loss': 0.0038, 'learning_rate': 1.4683e-06, 'epoch': 31.07, 'throughput': 997.13}
06/09/2024 09:06:56 - INFO - llamafactory.extras.callbacks - {'loss': 0.0052, 'learning_rate': 1.3815e-06, 'epoch': 31.19, 'throughput': 997.30}
06/09/2024 09:09:13 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 1.2972e-06, 'epoch': 31.30, 'throughput': 997.30}
06/09/2024 09:10:53 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 1.2155e-06, 'epoch': 31.42, 'throughput': 997.55}
06/09/2024 09:12:40 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 1.1365e-06, 'epoch': 31.54, 'throughput': 997.81}
06/09/2024 09:14:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 1.0600e-06, 'epoch': 31.65, 'throughput': 997.48}
06/09/2024 09:16:55 - INFO - llamafactory.extras.callbacks - {'loss': 0.0022, 'learning_rate': 9.8612e-07, 'epoch': 31.77, 'throughput': 997.57}
06/09/2024 09:19:26 - INFO - llamafactory.extras.callbacks - {'loss': 0.0037, 'learning_rate': 9.1486e-07, 'epoch': 31.88, 'throughput': 997.25}
06/09/2024 09:21:45 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 8.4624e-07, 'epoch': 32.00, 'throughput': 997.23}
06/09/2024 09:23:43 - INFO - llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 7.8024e-07, 'epoch': 32.12, 'throughput':
997.32}
06/09/2024 09:25:59 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 7.1688e-07, 'epoch': 32.23, 'throughput': 997.25}
06/09/2024 09:28:28 - INFO - llamafactory.extras.callbacks - {'loss': 0.0035, 'learning_rate': 6.5617e-07, 'epoch': 32.35, 'throughput': 996.95}
06/09/2024 09:30:35 - INFO - llamafactory.extras.callbacks - {'loss': 0.0063, 'learning_rate': 5.9810e-07, 'epoch': 32.46, 'throughput': 996.92}
06/09/2024 09:30:35 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1400
06/09/2024 09:30:35 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1400\config.json
06/09/2024 09:30:35 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1400\generation_config.json
06/09/2024 09:30:41 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1400\model.safetensors
06/09/2024 09:30:41 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1400\tokenizer_config.json
06/09/2024 09:30:41 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1400\special_tokens_map.json
06/09/2024 09:32:30 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 5.4270e-07, 'epoch': 32.58, 'throughput': 996.79}
06/09/2024 09:34:17 - INFO - llamafactory.extras.callbacks - {'loss': 0.0020, 'learning_rate': 4.8996e-07, 'epoch': 32.70, 'throughput': 997.02}
06/09/2024 09:36:26 - INFO - llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 4.3989e-07, 'epoch': 32.81, 'throughput': 997.12}
06/09/2024 09:38:35 - INFO - llamafactory.extras.callbacks - {'loss': 0.0016, 'learning_rate':
3.9250e-07, 'epoch': 32.93, 'throughput': 997.16}
06/09/2024 09:40:51 - INFO - llamafactory.extras.callbacks - {'loss': 0.0015, 'learning_rate': 3.4778e-07, 'epoch': 33.04, 'throughput': 997.07}
06/09/2024 09:42:47 - INFO - llamafactory.extras.callbacks - {'loss': 0.0057, 'learning_rate': 3.0575e-07, 'epoch': 33.16, 'throughput': 997.25}
06/09/2024 09:45:07 - INFO - llamafactory.extras.callbacks - {'loss': 0.0016, 'learning_rate': 2.6642e-07, 'epoch': 33.28, 'throughput': 996.81}
06/09/2024 09:46:46 - INFO - llamafactory.extras.callbacks - {'loss': 0.0016, 'learning_rate': 2.2977e-07, 'epoch': 33.39, 'throughput': 997.10}
06/09/2024 09:48:52 - INFO - llamafactory.extras.callbacks - {'loss': 0.0019, 'learning_rate': 1.9583e-07, 'epoch': 33.51, 'throughput': 997.13}
06/09/2024 09:50:50 - INFO - llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 1.6458e-07, 'epoch': 33.62, 'throughput': 997.16}
06/09/2024 09:52:42 - INFO - llamafactory.extras.callbacks - {'loss': 0.0042, 'learning_rate': 1.3604e-07, 'epoch': 33.74, 'throughput': 997.29}
06/09/2024 09:54:48 - INFO - llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 1.1022e-07, 'epoch': 33.86, 'throughput': 997.47}
06/09/2024 09:57:05 - INFO - llamafactory.extras.callbacks - {'loss': 0.0017, 'learning_rate': 8.7097e-08, 'epoch': 33.97, 'throughput': 997.29}
06/09/2024 09:59:36 - INFO - llamafactory.extras.callbacks - {'loss': 0.0015, 'learning_rate': 6.6693e-08, 'epoch': 34.09, 'throughput': 997.05}
06/09/2024 10:01:03 - INFO - llamafactory.extras.callbacks - {'loss': 0.0018, 'learning_rate': 4.9005e-08, 'epoch': 34.20, 'throughput': 997.33}
06/09/2024 10:03:24 - INFO - llamafactory.extras.callbacks - {'loss': 0.0024, 'learning_rate': 3.4034e-08, 'epoch': 34.32, 'throughput': 997.25}
06/09/2024 10:05:29 - INFO - llamafactory.extras.callbacks - {'loss': 0.0021, 'learning_rate': 2.1784e-08, 'epoch': 34.43, 'throughput': 997.41}
06/09/2024 10:07:24 - INFO - llamafactory.extras.callbacks -
{'loss': 0.0035, 'learning_rate': 1.2254e-08, 'epoch': 34.55, 'throughput': 997.57}
06/09/2024 10:09:26 - INFO - llamafactory.extras.callbacks - {'loss': 0.0016, 'learning_rate': 5.4465e-09, 'epoch': 34.67, 'throughput': 997.75}
06/09/2024 10:11:31 - INFO - llamafactory.extras.callbacks - {'loss': 0.0015, 'learning_rate': 1.3617e-09, 'epoch': 34.78, 'throughput': 997.77}
06/09/2024 10:11:31 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1500
06/09/2024 10:11:31 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1500\config.json
06/09/2024 10:11:31 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1500\generation_config.json
06/09/2024 10:11:39 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1500\model.safetensors
06/09/2024 10:11:39 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1500\tokenizer_config.json
06/09/2024 10:11:39 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\checkpoint-1500\special_tokens_map.json
06/09/2024 10:13:48 - INFO - llamafactory.extras.callbacks - {'loss': 0.0053, 'learning_rate': 0.0000e+00, 'epoch': 34.90, 'throughput': 997.38}
06/09/2024 10:13:48 - INFO - transformers.trainer - Training completed.
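The 'learning_rate' column in the records above traces a cosine decay with no warmup: it sits at about half its peak (2.4870e-05) near epoch 17.5, the midpoint of the 35-epoch run, and ends at exactly 0.0000e+00 on the final step. A minimal sketch that reproduces the logged values, assuming a peak learning rate of 5e-5 and 1505 total optimizer steps (both inferred from the log itself, not stated in it):

```python
import math

def cosine_lr(step: int, peak_lr: float = 5e-5, total_steps: int = 1505) -> float:
    """Cosine annealing without warmup: peak_lr at step 0, decaying to 0 at total_steps.

    peak_lr and total_steps are assumptions inferred from the logged values above.
    """
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * step / total_steps))

# Callback records appear every 5 optimizer steps; the last one before the lr
# hits zero shows 'learning_rate': 1.3617e-09, i.e. 5 steps before the end.
print(f"{cosine_lr(1500):.4e}")  # → 1.3617e-09
```

Under these assumptions the formula also matches the mid-run records (for example it gives roughly 8.41e-06 at step 1100, where the log shows 8.4144e-06 at checkpoint-1100); a different peak or step count in the actual run would shift these values slightly.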
Do not forget to share your model on huggingface.co/models =)
06/09/2024 10:13:48 - INFO - transformers.trainer - Saving model checkpoint to saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14
06/09/2024 10:13:48 - INFO - transformers.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\config.json
06/09/2024 10:13:48 - INFO - transformers.generation.configuration_utils - Configuration saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\generation_config.json
06/09/2024 10:13:53 - INFO - transformers.modeling_utils - Model weights saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\model.safetensors
06/09/2024 10:13:53 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\tokenizer_config.json
06/09/2024 10:13:53 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves\Qwen2-0.5B\full\train_2024-06-08-23-23-14\special_tokens_map.json
06/09/2024 10:13:53 - WARNING - llamafactory.extras.ploting - No metric eval_loss to plot.
06/09/2024 10:13:53 - INFO - transformers.modelcard - Dropping the following result as it does not have all the necessary fields: {'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
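Since the run logged no eval_loss (hence the warning from llamafactory.extras.ploting), the training-loss curve is the main diagnostic, and it can be recovered from the raw console log. A small sketch: the regex mirrors the dict format printed by llamafactory.extras.callbacks, while `parse_log` itself is a hypothetical helper, not part of the library:

```python
import re

# Matches one callback record: {'loss': ..., 'learning_rate': ..., 'epoch': ..., 'throughput': ...}
RECORD = re.compile(
    r"\{'loss': (?P<loss>[\d.]+), 'learning_rate': (?P<lr>[\d.e+-]+), "
    r"'epoch': (?P<epoch>[\d.]+), 'throughput': (?P<tp>[\d.]+)\}"
)

def parse_log(text: str) -> list[tuple[float, float, float, float]]:
    """Extract (epoch, loss, learning_rate, throughput) tuples from console-log text."""
    return [
        (float(m["epoch"]), float(m["loss"]), float(m["lr"]), float(m["tp"]))
        for m in RECORD.finditer(text)
    ]

sample = ("06/09/2024 10:13:48 - INFO - llamafactory.extras.callbacks - "
          "{'loss': 0.0053, 'learning_rate': 0.0000e+00, 'epoch': 34.90, 'throughput': 997.38}")
print(parse_log(sample))  # → [(34.9, 0.0053, 0.0, 997.38)]
```

Feeding the whole log file through `parse_log` yields the series that the dropped eval plot would otherwise have provided; records straddling a wrapped line must be rejoined first, since the regex only matches complete dicts.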