phi-2-sft-alpaca_gpt4_en-ep1 / export_log.txt

yhyu13

d927329 12 months ago

	/home/hangyu5/anaconda3/envs/llama_factory/lib/python3.11/site-packages/trl/trainer/ppo_config.py:141: UserWarning: The `optimize_cuda_cache` arguement will be deprecated soon, please use `optimize_device_cache` instead.
	warnings.warn(
	[INFO|tokenization_utils_base.py:2024] 2023-12-19 12:57:02,821 >> loading file vocab.json
	[INFO|tokenization_utils_base.py:2024] 2023-12-19 12:57:02,821 >> loading file merges.txt
	[INFO|tokenization_utils_base.py:2024] 2023-12-19 12:57:02,821 >> loading file tokenizer.json
	[INFO|tokenization_utils_base.py:2024] 2023-12-19 12:57:02,821 >> loading file added_tokens.json
	[INFO|tokenization_utils_base.py:2024] 2023-12-19 12:57:02,821 >> loading file special_tokens_map.json
	[INFO|tokenization_utils_base.py:2024] 2023-12-19 12:57:02,821 >> loading file tokenizer_config.json
	[WARNING|logging.py:314] 2023-12-19 12:57:02,875 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
	[INFO|configuration_utils.py:737] 2023-12-19 12:57:02,875 >> loading configuration file ./models/phi-2/config.json
	[INFO|configuration_utils.py:737] 2023-12-19 12:57:02,877 >> loading configuration file ./models/phi-2/config.json
	[INFO|configuration_utils.py:802] 2023-12-19 12:57:02,877 >> Model config PhiConfig {
	  "_name_or_path": "./models/phi-2",
	  "activation_function": "gelu_new",
	  "architectures": [
	    "PhiForCausalLM"
	  ],
	  "attn_pdrop": 0.0,
	  "auto_map": {
	    "AutoConfig": "configuration_phi.PhiConfig",
	    "AutoModelForCausalLM": "modeling_phi.PhiForCausalLM"
	  },
	  "embd_pdrop": 0.0,
	  "flash_attn": false,
	  "flash_rotary": false,
	  "fused_dense": false,
	  "img_processor": null,
	  "initializer_range": 0.02,
	  "layer_norm_epsilon": 1e-05,
	  "model_type": "phi-msft",
	  "n_embd": 2560,
	  "n_head": 32,
	  "n_head_kv": null,
	  "n_inner": null,
	  "n_layer": 32,
	  "n_positions": 2048,
	  "resid_pdrop": 0.1,
	  "rotary_dim": 32,
	  "tie_word_embeddings": false,
	  "torch_dtype": "float16",
	  "transformers_version": "4.36.1",
	  "vocab_size": 51200
	}

	[INFO|modeling_utils.py:3329] 2023-12-19 12:57:02,908 >> loading weights file ./models/phi-2/model.safetensors.index.json
	[INFO|modeling_utils.py:1341] 2023-12-19 12:57:02,908 >> Instantiating PhiForCausalLM model under default dtype torch.float16.
	[INFO|configuration_utils.py:826] 2023-12-19 12:57:02,909 >> Generate config GenerationConfig {}

	[INFO|configuration_utils.py:826] 2023-12-19 12:57:02,909 >> Generate config GenerationConfig {}

	Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
	Loading checkpoint shards: 50%|█████ | 1/2 [00:00<00:00, 5.79it/s]
	Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 6.04it/s]
	Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 6.00it/s]
	[INFO|modeling_utils.py:4173] 2023-12-19 12:57:03,336 >> All model checkpoint weights were used when initializing PhiForCausalLM.

	[INFO|modeling_utils.py:4181] 2023-12-19 12:57:03,336 >> All the weights of PhiForCausalLM were initialized from the model checkpoint at ./models/phi-2.
	If your task is similar to the task the model of the checkpoint was trained on, you can already use PhiForCausalLM for predictions without further training.
	[INFO|configuration_utils.py:779] 2023-12-19 12:57:03,338 >> loading configuration file ./models/phi-2/generation_config.json
	[INFO|configuration_utils.py:826] 2023-12-19 12:57:03,339 >> Generate config GenerationConfig {}

	12/19/2023 12:57:03 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
	12/19/2023 12:57:04 - INFO - llmtuner.model.adapter - Merged 1 adapter(s).
	12/19/2023 12:57:04 - INFO - llmtuner.model.adapter - Loaded adapter(s): ./models/sft/phi-2-sft-alpaca_gpt4_en-ep1-lora
	12/19/2023 12:57:04 - INFO - llmtuner.model.loader - trainable params: 0 || all params: 2779683840 || trainable%: 0.0000
	12/19/2023 12:57:04 - INFO - llmtuner.model.loader - This IS expected that the trainable params is 0 if you are using model for inference only.
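
Note on the "all params" figure above: it can be reproduced from the values in the PhiConfig dump. The sketch below is a sanity check, not code from the export tool. It assumes details the log does not state: that `n_inner: null` falls back to 4 * n_embd, that every dense layer has a bias, that each block has a single LayerNorm, and that Q, K and V are one fused projection. The untied output head follows from `tie_word_embeddings: false`.

```python
# Values copied from the PhiConfig dump in the log.
n_embd = 2560
n_layer = 32
vocab_size = 51200

# Assumption (not in the log): n_inner = null falls back to 4 * n_embd.
n_inner = 4 * n_embd


def linear(n_in, n_out):
    """Parameter count of a dense layer with bias (bias is assumed)."""
    return n_in * n_out + n_out


layer_norm = 2 * n_embd  # LayerNorm weight + bias

embedding = vocab_size * n_embd  # token embedding table

block = (
    layer_norm                    # one LayerNorm per block (assumed)
    + linear(n_embd, 3 * n_embd)  # fused Q, K, V projection (assumed)
    + linear(n_embd, n_embd)      # attention output projection
    + linear(n_embd, n_inner)     # MLP up projection
    + linear(n_inner, n_embd)     # MLP down projection
)

# Final LayerNorm plus an untied LM head (tie_word_embeddings is false).
head = layer_norm + linear(n_embd, vocab_size)

total = embedding + n_layer * block + head
print(total)
```

Under these assumptions the total equals the 2779683840 reported by llmtuner.model.loader, which is consistent with the LoRA weights having been merged into the base weights rather than kept as extra parameters.
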
	[INFO|configuration_utils.py:483] 2023-12-19 12:57:04,105 >> Configuration saved in ./models/export/phi-2-sft-alpaca_gpt4_en-ep1/config.json
	[INFO|configuration_utils.py:594] 2023-12-19 12:57:04,105 >> Configuration saved in ./models/export/phi-2-sft-alpaca_gpt4_en-ep1/generation_config.json
	[INFO|modeling_utils.py:2390] 2023-12-19 12:57:10,095 >> The model is bigger than the maximum size per checkpoint (5GB) and is going to be split in 2 checkpoint shards. You can find where each parameters has been saved in the index located at ./models/export/phi-2-sft-alpaca_gpt4_en-ep1/model.safetensors.index.json.
	[INFO|tokenization_utils_base.py:2432] 2023-12-19 12:57:10,096 >> tokenizer config file saved in ./models/export/phi-2-sft-alpaca_gpt4_en-ep1/tokenizer_config.json
	[INFO|tokenization_utils_base.py:2441] 2023-12-19 12:57:10,097 >> Special tokens file saved in ./models/export/phi-2-sft-alpaca_gpt4_en-ep1/special_tokens_map.json
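
Note on the 2-shard split reported above: it follows from the logged parameter count and dtype. The sketch below is a rough estimate only. It assumes the export was written in float16 (the dtype the model was loaded under) and ignores file headers and metadata. Because the log does not say whether "5GB" is decimal or binary, both readings are checked.

```python
import math

all_params = 2779683840   # "all params" from the llmtuner.model.loader line
bytes_per_param = 2       # float16, assumed to be the export dtype

total_bytes = all_params * bytes_per_param

limit_decimal = 5 * 10**9  # 5 GB read as decimal gigabytes
limit_binary = 5 * 2**30   # 5 GB read as binary gibibytes

# The weights exceed the per-checkpoint limit under either reading.
exceeds_limit = total_bytes > limit_decimal and total_bytes > limit_binary

# Lower bound on the shard count. The real split is made tensor by tensor,
# so it could be higher than this bound in general.
min_shards = math.ceil(total_bytes / limit_decimal)

print(total_bytes, exceeds_limit, min_shards)
```

The estimate gives about 5.56 GB of weights and a minimum of 2 shards, matching the shard count in the log.
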