Error during prediction

#1
by junyi111 - opened

I hit the following error at prediction time: RuntimeError: shape '[-1, 271]' is invalid for input of size 568. The full command and log follow:

(llama) huawei@work-3:~/workspace/LLaMA-Efficient-Tuning-main$ NCCL_SOCKET_IFNAME=eth1 NCCL_DEBUG=INFO python src/train_bash.py --stage sft --model_name_or_path model/llama65b/ --do_predict --dataset alpaca_zh --finetuning_type lora --checkpoint_dir path_to_sft_checkpoint_llama65B/ --output_dir path_to_predict_result --per_device_train_batch_size 1 --prompt_template default --lora_target W_pack --predict_with_generate --max_samples 20
[2023-08-07 02:42:01,059] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
08/07/2023 02:42:03 - WARNING - llmtuner.tuner.core.parser - Please specify prompt_template if you are using other pre-trained models.
08/07/2023 02:42:03 - WARNING - llmtuner.tuner.core.parser - ddp_find_unused_parameters needs to be set as False in DDP training.
08/07/2023 02:42:03 - INFO - llmtuner.tuner.core.parser - Process rank: 0, device: cuda:0, n_gpu: 8
distributed training: True, 16-bits training: False
08/07/2023 02:42:03 - INFO - llmtuner.tuner.core.parser - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=8,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=False,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=False,
do_predict=True,
do_train=False,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=no,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'fsdp_min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=,
ignore_data_skip=False,
include_inputs_for_metrics=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=path_to_predict_result/runs/Aug07_02-42-03_work-3,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=steps,
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_torch,
optim_args=None,
output_dir=path_to_predict_result,
overwrite_output_dir=False,
past_index=-1,
per_device_eval_batch_size=8,
per_device_train_batch_size=1,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=,
ray_scope=last,
remove_unused_columns=True,
report_to=['wandb'],
resume_from_checkpoint=None,
run_name=path_to_predict_result,
save_on_each_node=False,
save_safetensors=False,
save_steps=500,
save_strategy=steps,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
sortish_sampler=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
xpu_backend=None,
)
08/07/2023 02:42:03 - INFO - llmtuner.dsets.loader - Loading dataset alpaca_data_zh_51k.json...
/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/datasets/load.py:2069: FutureWarning: 'use_auth_token' was deprecated in favor of 'token' in version 2.14.0 and will be removed in 3.0.0.
You can remove this warning by passing 'token=None' instead.
warnings.warn(
Using custom data configuration default-f85e68495e5d6806
Loading Dataset Infos from /home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/datasets/packaged_modules/json
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /home/huawei/.cache/huggingface/datasets/json/default-f85e68495e5d6806/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96
Found cached dataset json (/home/huawei/.cache/huggingface/datasets/json/default-f85e68495e5d6806/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96)
Loading Dataset info from /home/huawei/.cache/huggingface/datasets/json/default-f85e68495e5d6806/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96
[INFO|tokenization_utils_base.py:1837] 2023-08-07 02:42:03,796 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:1837] 2023-08-07 02:42:03,796 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:1837] 2023-08-07 02:42:03,796 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:1837] 2023-08-07 02:42:03,796 >> loading file tokenizer_config.json
[WARNING|logging.py:295] 2023-08-07 02:42:03,797 >> You are using the legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
[INFO|configuration_utils.py:710] 2023-08-07 02:42:03,813 >> loading configuration file model/llama65b/config.json
[INFO|configuration_utils.py:768] 2023-08-07 02:42:03,815 >> Model config LlamaConfig {
"_name_or_path": "model/llama65b/",
"architectures": [
"LLaMAForCausalLM"
],
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 8192,
"initializer_range": 0.02,
"intermediate_size": 22016,
"max_position_embeddings": 2048,
"max_sequence_length": 2048,
"model_type": "llama",
"num_attention_heads": 64,
"num_hidden_layers": 80,
"num_key_value_heads": 64,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.31.0",
"use_cache": true,
"vocab_size": 32000
}

[INFO|modeling_utils.py:2600] 2023-08-07 02:42:03,841 >> loading weights file model/llama65b/pytorch_model.bin.index.json
[INFO|modeling_utils.py:1172] 2023-08-07 02:42:03,842 >> Instantiating LlamaForCausalLM model under default dtype torch.float16.
[INFO|configuration_utils.py:599] 2023-08-07 02:42:03,843 >> Generate config GenerationConfig {
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 0,
"transformers_version": "4.31.0"
}

Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 81/81 [01:53<00:00, 1.40s/it]
[INFO|modeling_utils.py:3329] 2023-08-07 02:44:11,321 >> All model checkpoint weights were used when initializing LlamaForCausalLM.

[INFO|modeling_utils.py:3337] 2023-08-07 02:44:11,321 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at model/llama65b/.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
[INFO|configuration_utils.py:559] 2023-08-07 02:44:11,329 >> loading configuration file model/llama65b/generation_config.json
[INFO|configuration_utils.py:599] 2023-08-07 02:44:11,330 >> Generate config GenerationConfig {
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 0,
"transformers_version": "4.31.0"
}

08/07/2023 02:44:11 - INFO - llmtuner.tuner.core.adapter - Fine-tuning method: LoRA
08/07/2023 02:48:58 - INFO - llmtuner.tuner.core.adapter - Merged 1 model checkpoint(s).
08/07/2023 02:48:58 - INFO - llmtuner.tuner.core.adapter - Loaded fine-tuned model from checkpoint(s): path_to_sft_checkpoint_llama65B/
trainable params: 0 || all params: 65285660672 || trainable%: 0.0000
Loading cached processed dataset at /home/huawei/.cache/huggingface/datasets/json/default-f85e68495e5d6806/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96/cache-80195560bd98b11e.arrow
input_ids:
[0, 319, 13563, 1546, 263, 12758, 1404, 322, 385, 23116, 21082, 20255, 29889, 450, 20255, 4076, 8444, 29892, 13173, 29892, 322, 1248, 568, 6089, 304, 278, 1404, 29915, 29879, 5155, 29889, 13, 29950, 7889, 29901, 29871, 31025, 231, 187, 193, 31076, 30210, 31222, 31433, 30872, 31466, 30210, 30457, 30502, 31141, 232, 193, 132, 30267, 13, 7900, 22137, 29901, 29871]
inputs:
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: 列举好的网站设计的三个特征。
Assistant:
label_ids:
[0, 29871, 31076, 30210, 31222, 31433, 30872, 31466, 30210, 30457, 30502, 31141, 232, 193, 132, 30392, 30406, 31229, 31373, 31076, 30952, 30214, 30989, 233, 156, 179, 233, 155, 150, 233, 138, 133, 30210, 31943, 31727, 31320, 31901, 30503, 31568, 235, 170, 140, 232, 147, 187, 31674, 31074, 30267, 30872, 31466, 31370, 31751, 236, 149, 139, 30783, 30895, 31062, 232, 146, 154, 231, 191, 154, 30214, 31666, 31320, 30733, 30785, 30406, 31229, 30815, 232, 167, 162, 31157, 235, 170, 133, 30533, 31943, 31727, 30503, 232, 194, 174, 31859, 235, 177, 194, 31658, 30728, 31294, 30210, 30824, 31605, 30267, 30630, 235, 170, 133, 30210, 30630, 30415, 30998, 31441, 31420, 31100, 233, 135, 140, 233, 133, 169, 30503, 232, 147, 187, 31674, 30313, 30210, 30988, 236, 173, 143, 30267]
labels:
好的网站设计的三个特征是用户友好性,清晰易懂的导航结构和视觉吸引力。设计应该针对目标受众,并结合使用户能够直观地导航和快速访问内容的元素。美观的美学将创造更愉悦和吸引人的体验。
[INFO|trainer.py:386] 2023-08-07 02:48:58,475 >> You have loaded a model on multiple GPUs. is_model_parallel attribute will be force-set to True to avoid any unexpected behavior such as device placement mismatching.
[INFO|trainer.py:3081] 2023-08-07 02:48:58,478 >> ***** Running Prediction *****
[INFO|trainer.py:3083] 2023-08-07 02:48:58,478 >> Num examples = 20
[INFO|trainer.py:3086] 2023-08-07 02:48:58,478 >> Batch size = 8
[INFO|configuration_utils.py:599] 2023-08-07 02:48:58,493 >> Generate config GenerationConfig {
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 0,
"transformers_version": "4.31.0"
}

Traceback (most recent call last):
  File "/home/huawei/workspace/LLaMA-Efficient-Tuning-main/src/train_bash.py", line 45, in <module>
    main()
  File "/home/huawei/workspace/LLaMA-Efficient-Tuning-main/src/train_bash.py", line 12, in main
    run_sft(model_args, data_args, training_args, finetuning_args)
  File "/home/huawei/workspace/LLaMA-Efficient-Tuning-main/src/llmtuner/tuner/sft/workflow.py", line 89, in run_sft
    predict_results = trainer.predict(dataset, metric_key_prefix="predict", **gen_kwargs)
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/trainer_seq2seq.py", line 216, in predict
    return super().predict(test_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/trainer.py", line 3010, in predict
    output = eval_loop(
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/trainer.py", line 3123, in evaluation_loop
    loss, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
  File "/home/huawei/workspace/LLaMA-Efficient-Tuning-main/src/llmtuner/tuner/sft/trainer.py", line 40, in prediction_step
    loss, generated_tokens, labels = super().prediction_step(
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/trainer_seq2seq.py", line 282, in prediction_step
    generated_tokens = self.model.generate(**inputs, **gen_kwargs)
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/generation/utils.py", line 1588, in generate
    return self.sample(
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/generation/utils.py", line 2642, in sample
    outputs = self(
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 806, in forward
    outputs = self.model(
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/huawei/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 643, in forward
    position_ids = position_ids.view(-1, seq_length).long()
RuntimeError: shape '[-1, 271]' is invalid for input of size 568
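
For reference: the failing call at the bottom of the traceback is position_ids.view(-1, seq_length) in modeling_llama.py, with seq_length = 271, while the position_ids tensor reaching it holds 568 elements. Since 568 is not a multiple of 271 (2 × 271 = 542, remainder 26), the reshape cannot succeed. Below is a minimal standalone sketch (my own illustration of the torch error, not the project or model code) that reproduces the same message:

import torch

position_ids = torch.arange(568)   # flat tensor with 568 elements, as in the error above
seq_length = 271                   # target row length from the error message
# 568 % 271 == 26, so the elements cannot be split into whole rows of length 271:
position_ids.view(-1, seq_length)  # RuntimeError: shape '[-1, 271]' is invalid for input of size 568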
