oMateos2020

Sep 13, 2023

edited Sep 14, 2023

Hi everyone,
I am trying to use the flair/ner-spanish-large model to extract several entities from Spanish sentences. I tested it in the Hosted inference API and liked the results, so I wrote a notebook to test it locally, but I get the following error:

RuntimeError Traceback (most recent call last)
Cell In[4], line 2
1 # load tagger
----> 2 tagger = SequenceTagger.load("flair/ner-spanish-large")
4 # make example sentence
5 sentence = Sentence("George Washington fue a Washington. ")

File ~.conda\envs\flair_0_12_2\lib\site-packages\flair\models\sequence_tagger_model.py:1035, in SequenceTagger.load(cls, model_path)
1031 @classmethod
1032 def load(cls, model_path: Union[str, Path, Dict[str, Any]]) -> "SequenceTagger":
1033 from typing import cast
-> 1035 return cast("SequenceTagger", super().load(model_path=model_path))

File ~.conda\envs\flair_0_12_2\lib\site-packages\flair\nn\model.py:559, in Classifier.load(cls, model_path)
555 @classmethod
556 def load(cls, model_path: Union[str, Path, Dict[str, Any]]) -> "Classifier":
557 from typing import cast
--> 559 return cast("Classifier", super().load(model_path=model_path))

File ~.conda\envs\flair_0_12_2\lib\site-packages\flair\nn\model.py:191, in Model.load(cls, model_path)
189 if not isinstance(model_path, dict):
190 model_file = cls._fetch_model(str(model_path))
--> 191 state = load_torch_state(model_file)
192 else:
193 state = model_path

File ~.conda\envs\flair_0_12_2\lib\site-packages\flair\file_utils.py:359, in load_torch_state(model_file)
355 # load_big_file is a workaround by https://github.com/highway11git
356 # to load models on some Mac/Windows setups
357 # see https://github.com/zalandoresearch/flair/issues/351
358 f = load_big_file(model_file)
--> 359 return torch.load(f, map_location="cpu")

File ~.conda\envs\flair_0_12_2\lib\site-packages\torch\serialization.py:809, in load(f, map_location, pickle_module, weights_only, **pickle_load_args)
807 except RuntimeError as e:
808 raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
--> 809 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
810 if weights_only:
811 try:

File ~.conda\envs\flair_0_12_2\lib\site-packages\torch\serialization.py:1172, in _load(zip_file, map_location, pickle_module, pickle_file, **pickle_load_args)
1170 unpickler = UnpicklerWrapper(data_file, **pickle_load_args)
1171 unpickler.persistent_load = persistent_load
-> 1172 result = unpickler.load()
1174 torch._utils._validate_loaded_sparse_tensors()
1176 return result

File ~.conda\envs\flair_0_12_2\lib\site-packages\flair\embeddings\transformer.py:1169, in TransformerEmbeddings.__setstate__(self, state)
1166 self.__dict__[key] = embedding.__dict__[key]
1168 if model_state_dict:
-> 1169 self.model.load_state_dict(model_state_dict)

File ~.conda\envs\flair_0_12_2\lib\site-packages\torch\nn\modules\module.py:2041, in Module.load_state_dict(self, state_dict, strict)
2036 error_msgs.insert(
2037 0, 'Missing key(s) in state_dict: {}. '.format(
2038 ', '.join('"{}"'.format(k) for k in missing_keys)))
2040 if len(error_msgs) > 0:
-> 2041 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
2042 self.__class__.__name__, "\n\t".join(error_msgs)))
2043 return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for XLMRobertaModel:
Unexpected key(s) in state_dict: "embeddings.position_ids".

My code is the same as in the demo:

from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("flair/ner-spanish-large")

# make example sentence
sentence = Sentence("George Washington fue a Washington")

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

Finally, I tried force-reinstalling the dependencies according to the requirements file from the GitHub repo, but the same RuntimeError about the unexpected key in the state_dict for XLMRobertaModel persists. These are the lines from the requirements file:

boto3>=1.20.27
bpemb>=0.3.2
conllu>=4.0
deprecated>=1.2.13
ftfy>=6.1.0
gdown>=4.4.0
gensim>=4.2.0
huggingface-hub>=0.10.0
janome>=0.4.2
langdetect>=1.0.9
lxml>=4.8.0
matplotlib>=2.2.3
more-itertools>=8.13.0
mpld3>=0.3
pptree>=3.1
python-dateutil>=2.8.2
pytorch_revgrad>=0.2.0
regex>=2022.1.18
scikit-learn>=1.0.2
segtok>=1.5.11
sqlitedict>=2.0.0
tabulate>=0.8.10
torch>=1.5.0,!=1.8
tqdm>=4.63.0
transformer-smaller-training-vocab>=0.2.3
transformers[sentencepiece]>=4.18.0,<5.0.0
urllib3<2.0.0,>=1.0.0 # pin below 2 to make dependency resolution faster.
wikipedia-api>=0.5.7
semver<4.0.0,>=3.0.0
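
Most of these entries are minimum-version pins, so the versions actually resolved in an environment can differ from one install to the next. A minimal sketch for listing what is installed, assuming Python 3.8+ (for `importlib.metadata`); the helper name `installed_versions` is made up for illustration and is not part of flair:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None. This helper is
    hypothetical (illustration only), not a flair or torch API.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # The three packages that appear in the traceback and the requirements list.
    for name, ver in installed_versions(["flair", "torch", "transformers"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in both the local conda environment and the Colab runtime would show whether the two setups resolve to the same flair, torch and transformers versions.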

Do you have any idea why I am getting this error, and whether there is a way to make this work? I am on Windows 10 because of work requirements, but I also tried running it in Google Colab and got the same error ( https://colab.research.google.com/drive/1gEXkxDvK2MAY1ztGrN5z9uE0pzi6BAcs?usp=sharing ).

Any help would be highly appreciated. Thanks in advance.