Significantly different results when using Inference API vs Endpoints or transformers

#10
by Creo

Currently, the Inference API produces better results than other methods, but it's pretty slow.

Could you please check whether the same model is used behind these methods?

Thanks a lot!

Same for me. The model from
"""
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")
"""
behaves oddly. For the single word "WeChat", it gives three separate results:

{'entity': 'B-ORG', 'score': 0.88335264, 'index': 241, 'word': 'We', 'start': 964, 'end': 966},
{'entity': 'I-ORG', 'score': 0.85090846, 'index': 242, 'word': '##C', 'start': 966, 'end': 967},
{'entity': 'I-ORG', 'score': 0.5859645, 'index': 243, 'word': '##hat', 'start': 967, 'end': 970}

Is the API on the model page somehow post-processing the results using the "start" and "end" indices?
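For context, here is a minimal sketch of the kind of call that produces this per-subword output; the tokenizer checkpoint and the example sentence are assumptions, not taken from the original post:

"""
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Checkpoint and example sentence are illustrative assumptions
tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")

# Without any grouping, the pipeline reports one entry per WordPiece token,
# which is why "WeChat" comes back as "We", "##C", "##hat"
ner_pipe = pipeline("ner", model=model, tokenizer=tokenizer)
print(ner_pipe("I sent the file over WeChat yesterday."))
"""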

Hello,

I'm not sure whether you are all still running into this issue, but I found that the pipeline gives the same results as the model page when you set grouped_entities=True.

So basically:

ner_pipe = pipeline("ner", tokenizer=k_tokenizer, model=k_model, grouped_entities=True)
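For completeness, a fuller sketch of that call under the same illustrative assumptions as above (checkpoint and sentence are examples, not from this thread); on recent transformers releases the same grouping is requested with aggregation_strategy="simple":

"""
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Same illustrative checkpoint and sentence as in the earlier sketch
tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")

# grouped_entities=True merges adjacent subword predictions into one span,
# so "We" + "##C" + "##hat" is reported as a single ORG entity "WeChat".
# Newer transformers versions spell this aggregation_strategy="simple".
ner_pipe = pipeline("ner", model=model, tokenizer=tokenizer, grouped_entities=True)
print(ner_pipe("I sent the file over WeChat yesterday."))
"""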

It helped me, hopefully it helps you!

@KingTechnician Worked, thanks!
