QwenTT-0.5B-INT8 / running_log.txt
05/19/2024 22:55:50 - INFO - transformers.tokenization_utils_base - loading file vocab.json from cache at /root/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B-Chat/snapshots/4d14e384a4b037942bb3f3016665157c8bcb70ea/vocab.json
05/19/2024 22:55:50 - INFO - transformers.tokenization_utils_base - loading file merges.txt from cache at /root/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B-Chat/snapshots/4d14e384a4b037942bb3f3016665157c8bcb70ea/merges.txt
05/19/2024 22:55:50 - INFO - transformers.tokenization_utils_base - loading file tokenizer.json from cache at /root/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B-Chat/snapshots/4d14e384a4b037942bb3f3016665157c8bcb70ea/tokenizer.json
05/19/2024 22:55:50 - INFO - transformers.tokenization_utils_base - loading file added_tokens.json from cache at None
05/19/2024 22:55:50 - INFO - transformers.tokenization_utils_base - loading file special_tokens_map.json from cache at None
05/19/2024 22:55:50 - INFO - transformers.tokenization_utils_base - loading file tokenizer_config.json from cache at /root/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B-Chat/snapshots/4d14e384a4b037942bb3f3016665157c8bcb70ea/tokenizer_config.json
05/19/2024 22:55:51 - WARNING - transformers.tokenization_utils_base - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
05/19/2024 22:55:51 - INFO - llamafactory.data.template - Replace eos token: <|im_end|>
05/19/2024 22:55:51 - INFO - llamafactory.data.loader - Loading dataset identity.json...
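The lines above are the tokenizer- and dataset-loading phase: every tokenizer file is pulled from the local Hub cache for Qwen/Qwen1.5-0.5B-Chat, and LLaMA-Factory swaps the EOS token for the chat template's end-of-turn marker <|im_end|>. Outside of LLaMA-Factory, the step corresponds roughly to the sketch below (identity.json is the framework's bundled identity dataset and is loaded internally, not shown here):

```python
from transformers import AutoTokenizer

# Equivalent of the tokenizer-loading step in the log: all files
# (vocab.json, merges.txt, tokenizer.json, tokenizer_config.json)
# come from the Hub cache for the base chat model.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B-Chat")

# The qwen chat template ends each assistant turn with <|im_end|>,
# hence the "Replace eos token: <|im_end|>" line in the log.
tokenizer.eos_token = "<|im_end|>"

messages = [{"role": "user", "content": "Who are you?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```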
05/19/2024 22:55:58 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B-Chat/snapshots/4d14e384a4b037942bb3f3016665157c8bcb70ea/config.json
05/19/2024 22:55:58 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"_name_or_path": "Qwen/Qwen1.5-0.5B-Chat",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
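As a sanity check on the config above, the dense parameter count can be reproduced by hand from hidden_size, intermediate_size, num_hidden_layers and vocab_size; it matches the "all params" figure logged at the LoRA step below once the 786,432 adapter weights are subtracted. A rough sketch, assuming Qwen2's layer layout (biases on the q/k/v projections only, no bias in o_proj or the MLP, tied word embeddings):

```python
# Back-of-the-envelope parameter count from the config above.
hidden, inter, layers, vocab = 1024, 2816, 24, 151936

embed = vocab * hidden                                   # word embeddings, tied with lm_head
attn = 4 * hidden * hidden + 3 * hidden                  # q, k, v, o projections + q/k/v biases
mlp = 3 * hidden * inter                                 # gate, up, down projections
norms = 2 * hidden                                       # two RMSNorms per layer
total = embed + layers * (attn + mlp + norms) + hidden   # plus the final RMSNorm

print(total)  # 463_987_712 = 464_774_144 (all params below) - 786_432 (LoRA weights)
```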
05/19/2024 22:55:58 - INFO - llamafactory.model.utils.quantization - Quantizing model to 8 bit.
05/19/2024 22:55:58 - INFO - transformers.modeling_utils - loading weights file model.safetensors from cache at /root/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B-Chat/snapshots/4d14e384a4b037942bb3f3016665157c8bcb70ea/model.safetensors
05/19/2024 22:55:58 - INFO - transformers.modeling_utils - Instantiating Qwen2ForCausalLM model under default dtype torch.float16.
05/19/2024 22:55:58 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151645
}
05/19/2024 22:56:02 - INFO - transformers.modeling_utils - All model checkpoint weights were used when initializing Qwen2ForCausalLM.
05/19/2024 22:56:02 - INFO - transformers.modeling_utils - All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at Qwen/Qwen1.5-0.5B-Chat.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
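The "Quantizing model to 8 bit" line means the frozen base weights are loaded through bitsandbytes in LLM.int8 format while the LoRA adapter is trained on top. A minimal sketch of the equivalent transformers call (not the exact code path LLaMA-Factory uses internally):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the base checkpoint with 8-bit weight quantization via bitsandbytes;
# non-quantized modules and compute stay in half precision, matching the
# "default dtype torch.float16" line above.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B-Chat",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
    device_map="auto",
)
```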
05/19/2024 22:56:02 - INFO - transformers.generation.configuration_utils - loading configuration file generation_config.json from cache at /root/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B-Chat/snapshots/4d14e384a4b037942bb3f3016665157c8bcb70ea/generation_config.json
05/19/2024 22:56:02 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"repetition_penalty": 1.1,
"top_p": 0.8
}
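The generation defaults above ship with the checkpoint's generation_config.json: nucleus sampling with top_p 0.8 and a 1.1 repetition penalty, with both <|im_end|> and <|endoftext|> accepted as end-of-sequence tokens. They can be reproduced explicitly at inference time, for example:

```python
from transformers import GenerationConfig

# Mirror of the checkpoint's generation_config.json shown above.
gen_config = GenerationConfig(
    bos_token_id=151643,
    eos_token_id=[151645, 151643],
    pad_token_id=151643,
    do_sample=True,
    top_p=0.8,
    repetition_penalty=1.1,
)
# outputs = model.generate(**inputs, generation_config=gen_config, max_new_tokens=128)
```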
05/19/2024 22:56:02 - INFO - llamafactory.model.utils.checkpointing - Gradient checkpointing enabled.
05/19/2024 22:56:02 - INFO - llamafactory.model.utils.attention - Using torch SDPA for faster training and inference.
05/19/2024 22:56:02 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
05/19/2024 22:56:02 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
05/19/2024 22:56:02 - INFO - llamafactory.model.loader - trainable params: 786432 || all params: 464774144 || trainable%: 0.1692
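The log does not print the adapter hyperparameters, but the 786,432 trainable parameters are consistent with rank-8 LoRA on the q_proj and v_proj modules of all 24 layers: 24 layers x 2 modules x r x (in_features + out_features) = 24 x 2 x 8 x 2048 = 786,432. A sketch of such a configuration with peft (target modules and lora_alpha are assumptions, not read from the log):

```python
from peft import LoraConfig, get_peft_model

# Assumed adapter setup that reproduces the logged trainable-parameter count.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,                        # assumed (2 * r); not shown in the log
    target_modules=["q_proj", "v_proj"],  # assumed; any pair of 1024x1024 projections gives the same count
    task_type="CAUSAL_LM",
)
# model = get_peft_model(base_model, lora_config)
# model.print_trainable_parameters()      # -> trainable params: 786,432
```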
05/19/2024 22:56:02 - INFO - transformers.trainer - Using auto half precision backend
05/19/2024 22:56:03 - INFO - transformers.trainer - ***** Running training *****
05/19/2024 22:56:03 - INFO - transformers.trainer - Num examples = 91
05/19/2024 22:56:03 - INFO - transformers.trainer - Num Epochs = 3
05/19/2024 22:56:03 - INFO - transformers.trainer - Instantaneous batch size per device = 2
05/19/2024 22:56:03 - INFO - transformers.trainer - Total train batch size (w. parallel, distributed & accumulation) = 16
05/19/2024 22:56:03 - INFO - transformers.trainer - Gradient Accumulation steps = 8
05/19/2024 22:56:03 - INFO - transformers.trainer - Total optimization steps = 15
05/19/2024 22:56:03 - INFO - transformers.trainer - Number of trainable parameters = 786,432
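The step count follows directly from the numbers above: 91 examples at a per-device batch size of 2 give 46 micro-batches per epoch, accumulating over 8 of them gives 5 optimizer steps per epoch (the incomplete accumulation window is dropped), and 3 epochs give 15 steps in total. In short:

```python
import math

num_examples, per_device_bs, accum, epochs = 91, 2, 8, 3

micro_batches_per_epoch = math.ceil(num_examples / per_device_bs)  # 46
steps_per_epoch = micro_batches_per_epoch // accum                 # 5
total_steps = steps_per_epoch * epochs                             # 15, as logged
effective_batch = per_device_bs * accum                            # 16, the "total train batch size"
```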
05/19/2024 22:56:30 - INFO - llamafactory.extras.callbacks - {'loss': 3.4258, 'learning_rate': 3.7500e-05, 'epoch': 0.87}
05/19/2024 22:56:57 - INFO - llamafactory.extras.callbacks - {'loss': 3.3578, 'learning_rate': 1.2500e-05, 'epoch': 1.74}
05/19/2024 22:57:24 - INFO - llamafactory.extras.callbacks - {'loss': 3.2979, 'learning_rate': 0.0000e+00, 'epoch': 2.61}
05/19/2024 22:57:24 - INFO - transformers.trainer -
Training completed. Do not forget to share your model on huggingface.co/models =)
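The logged learning rates at steps 5, 10 and 15 match warmup-free cosine decay from a peak of 5e-5; neither the peak nor the scheduler type appears in the log, so both are inferred, but the decayed values line up exactly:

```python
import math

peak_lr, total_steps = 5e-5, 15   # peak assumed; only the decayed values appear in the log

def cosine_lr(step: int) -> float:
    """Cosine decay with no warmup: lr(t) = peak * 0.5 * (1 + cos(pi * t / T))."""
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

print([cosine_lr(s) for s in (5, 10, 15)])  # [3.75e-05, 1.25e-05, 0.0], as logged
```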
05/19/2024 22:57:24 - INFO - transformers.trainer - Saving model checkpoint to saves/Qwen1.5-0.5B-Chat/lora/QwenTT-0.5B-INT8
05/19/2024 22:57:25 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B-Chat/snapshots/4d14e384a4b037942bb3f3016665157c8bcb70ea/config.json
05/19/2024 22:57:25 - INFO - transformers.configuration_utils - Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
05/19/2024 22:57:25 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Qwen1.5-0.5B-Chat/lora/QwenTT-0.5B-INT8/tokenizer_config.json
05/19/2024 22:57:25 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Qwen1.5-0.5B-Chat/lora/QwenTT-0.5B-INT8/special_tokens_map.json
05/19/2024 22:57:25 - WARNING - llamafactory.extras.ploting - No metric eval_loss to plot.
05/19/2024 22:57:25 - INFO - transformers.modelcard - Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
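After training, the LoRA adapter and tokenizer files live under saves/Qwen1.5-0.5B-Chat/lora/QwenTT-0.5B-INT8. A minimal sketch of loading them for inference with peft (the prompt is illustrative, and the adapter path is the save directory from the log):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

adapter_dir = "saves/Qwen1.5-0.5B-Chat/lora/QwenTT-0.5B-INT8"

# Load the base checkpoint and attach the trained LoRA adapter on top.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B-Chat", torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_dir)
tokenizer = AutoTokenizer.from_pretrained(adapter_dir)

messages = [{"role": "user", "content": "Who are you?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```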