The inference code has two problems: self.vocab and model.generate

#6
by twwch - opened

Traceback (most recent call last):
File "/data/chenhao/codes/kmai-model/src/test.py", line 7, in <module>
tokenizer = PegasusTokenizer.from_pretrained("/data/models/kmai/IDEA-CCNL/Randeng-Pegasus-523M-Summary-Chinese")
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2045, in from_pretrained
return cls._from_pretrained(
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/data/chenhao/codes/kmai-model/src/tokenizers_pegasus.py", line 154, in __init__
super().__init__(
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 366, in __init__
self._add_tokens(self.all_special_tokens_extended, special_tokens=True)
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 462, in _add_tokens
current_vocab = self.get_vocab().copy()
File "/data/chenhao/codes/kmai-model/src/tokenizers_pegasus.py", line 207, in get_vocab
return dict(self.vocab, **self.added_tokens_encoder)
AttributeError: 'PegasusTokenizer' object has no attribute 'vocab'
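
Reading the traceback above: the base-class constructor (`super().__init__`) calls `_add_tokens`, which calls `get_vocab()`, which reads `self.vocab`. The error would follow if the custom `tokenizers_pegasus.py` assigns `self.vocab` only after calling `super().__init__`; that ordering is an assumption, since the file itself is not shown in this thread. The sketch below reproduces the call order with small stand-in classes (`BaseTokenizer`, `BrokenTokenizer`, `FixedTokenizer` and `try_build` are hypothetical names, not the transformers API) and shows that assigning the attribute first avoids the error.

```python
# Stand-in classes, NOT the real transformers classes. They only mimic the
# call order visible in the traceback: __init__ -> _add_tokens -> get_vocab.
class BaseTokenizer:
    def __init__(self):
        self.added_tokens_encoder = {}
        # The base constructor reads the vocabulary during initialisation.
        self._add_tokens()

    def _add_tokens(self):
        self._vocab_snapshot = self.get_vocab().copy()

    def get_vocab(self):
        raise NotImplementedError


class BrokenTokenizer(BaseTokenizer):
    def __init__(self, vocab):
        super().__init__()  # get_vocab() runs here, before self.vocab exists
        self.vocab = vocab

    def get_vocab(self):
        return dict(self.vocab, **self.added_tokens_encoder)


class FixedTokenizer(BaseTokenizer):
    def __init__(self, vocab):
        self.vocab = vocab  # assign the vocabulary first
        super().__init__()  # get_vocab() can now find self.vocab

    def get_vocab(self):
        return dict(self.vocab, **self.added_tokens_encoder)


def try_build(cls, vocab):
    """Return "ok" if construction succeeds, else the AttributeError message."""
    try:
        cls(vocab)
        return "ok"
    except AttributeError as exc:
        return str(exc)


print(try_build(BrokenTokenizer, {"a": 0}))
print(try_build(FixedTokenizer, {"a": 0}))
```

If the custom tokenizer follows the broken pattern, moving its vocabulary-loading lines above the `super().__init__(` call in `tokenizers_pegasus.py` would be the analogous change.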

Traceback (most recent call last):
File "/data/chenhao/codes/kmai-model/src/test.py", line 12, in <module>
summary_ids = model.generate(inputs["input_ids"])
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/transformers/generation/utils.py", line 1496, in generate
model_kwargs = self._prepare_encoder_decoder_kwargs_for_generation(
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/transformers/generation/utils.py", line 661, in _prepare_encoder_decoder_kwargs_for_generation
model_kwargs["encoder_outputs"]: ModelOutput = encoder(**encoder_kwargs)
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/transformers/models/pegasus/modeling_pegasus.py", line 769, in forward
inputs_embeds = self.embed_tokens(input_ids) * self.embed_scale
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 162, in forward
return F.embedding(
File "/data/chenhao/anaconda/envs/kmai/lib/python3.10/site-packages/torch/nn/functional.py", line 2233, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
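
The `IndexError` above is raised inside the embedding lookup, which usually means at least one token id in `input_ids` is negative or is not smaller than the number of rows in the embedding table. One possible cause, not confirmed in this thread, is a tokenizer whose vocabulary does not match the checkpoint. A minimal sketch for checking this before calling `model.generate`; `find_out_of_range_ids` is a hypothetical helper, and the batch and table size below are made-up illustration values.

```python
def find_out_of_range_ids(input_ids, num_embeddings):
    """Return the sorted token ids that a table with `num_embeddings` rows cannot look up."""
    return sorted(
        {tok for row in input_ids for tok in row if tok < 0 or tok >= num_embeddings}
    )


# Made-up batch of token ids and a made-up embedding size, for illustration only.
batch = [[101, 7, 49999, 50000], [3, 50123, 1, 0]]
bad_ids = find_out_of_range_ids(batch, num_embeddings=50000)
print(bad_ids)  # [50000, 50123]

# With real objects the inputs would come from something like:
#   input_ids      = inputs["input_ids"].tolist()
#   num_embeddings = model.get_input_embeddings().num_embeddings
```

An empty result would point away from a vocabulary mismatch; a non-empty one shows which ids the tokenizer produced that the model has no embedding for.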

twwch changed discussion title from "The inference code has two problems," to "The inference code has two problems: self.vocab and model.generate"

It doesn't work for me either. Can this model actually be run at all???


Same issue. I get either
File "/opt/homebrew/lib/python3.8/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string

or

File "/Users/username/Path.../tokenizers_pegasus.py", line 209, in get_vocab
return dict(self.vocab, **self.added_tokens_encoder)
AttributeError: 'PegasusTokenizer' object has no attribute 'vocab'

Is the model not runnable?

Yes, it can't run!

Exception has occurred: AttributeError
'PegasusTokenizer' object has no attribute 'vocab'
File "D:\LOCALllm\summarize\tokenizers_pegasus.py", line 209, in get_vocab
return dict(self.vocab, **self.added_tokens_encoder)
^^^^^^^^^^
File "D:\LOCALllm\summarize\tokenizers_pegasus.py", line 157, in __init__
super().__init__(
File "D:\LOCALllm\summarize\query_my_pdf000003.py", line 20, in <module>
tokenizer = PegasusTokenizer.from_pretrained("D:\LOCALllm\summarize\IDEA-CCNL--Randeng-Pegasus-523M-Summary-Chinese")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'PegasusTokenizer' object has no attribute 'vocab'
