Model Specification

  • This is the state-of-the-art Twitter NER model (74.35% entity-level F1) on Tweebank V2's NER benchmark (also called Tweebank-NER), trained on the combined Tweebank-NER and WNUT 17 training data.
  • For more details about the TweebankNLP project, please refer to our paper and GitHub page.
  • In the paper, it is referred to as HuggingFace-BERTweet (TB2+W17).
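For readers unfamiliar with the metric, entity-level F1 is usually computed over whole entity spans rather than individual tokens: a prediction counts as correct only if its boundaries and type both match a gold entity exactly. The sketch below illustrates that common definition. It is an assumption that the paper's scorer follows the same exact-match convention, and the function name `entity_f1` and the example spans are hypothetical.

```python
def entity_f1(gold, pred):
    """Exact-match entity-level precision, recall and F1.

    Each entity is a (start, end, type) tuple; a predicted entity is a
    true positive only if the identical tuple appears in the gold set.
    """
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical spans: one exact match, one type error, one spurious entity.
gold_spans = [(0, 2, "PER"), (5, 6, "LOC")]
pred_spans = [(0, 2, "PER"), (5, 6, "ORG"), (8, 9, "MISC")]
precision, recall, f1 = entity_f1(gold_spans, pred_spans)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Note that the span with the right boundaries but the wrong type counts as both a missed gold entity and a false prediction, which is why entity-level scores are typically lower than token-level ones.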

How to use the model

  • PRE-PROCESSING: when applying the model to tweets, make sure the tweets are pre-processed with the TweetTokenizer to get the best performance.
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("TweebankNLP/bertweet-tb2_wnut17-ner")

model = AutoModelForTokenClassification.from_pretrained("TweebankNLP/bertweet-tb2_wnut17-ner")
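Once the model has produced one label per token, the labels still need to be merged into entities. The sketch below shows one way to do that, assuming the labels follow the BIO scheme (for example B-PER, I-PER, O) and have already been mapped from class ids to strings, e.g. via the model's `config.id2label`. The helper name `group_entities`, the example sentence, and the specific tag names are hypothetical; check the model's configuration for the actual label set.

```python
def group_entities(tokens, labels):
    """Merge BIO-tagged tokens into (entity_type, text) pairs."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Store the entity collected so far, if any.
        if current_tokens:
            entities.append((current_type, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        starts_new = label.startswith("B-") or (
            label.startswith("I-") and label[2:] != current_type
        )
        if starts_new:
            # A B- tag, or an I- tag that does not continue the open entity.
            flush()
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-"):
            current_tokens.append(token)
        else:
            # "O" (outside) closes any open entity.
            flush()
            current_type, current_tokens = None, []
    flush()
    return entities


# Hypothetical tokens and predicted labels, for illustration only.
tokens = ["Jane", "Doe", "joined", "Acme", "in", "Boston"]
labels = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC"]
print(group_entities(tokens, labels))
```

An I- tag that does not continue the open entity is treated here as the start of a new one, which is a lenient choice; a stricter decoder could discard such tags instead.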

References

If you use this repository in your research, please cite our paper:

@article{jiang2022tweetnlp,
    title={Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis},
    author={Jiang, Hang and Hua, Yining and Beeferman, Doug and Roy, Deb},
    journal={In Proceedings of the 13th Language Resources and Evaluation Conference (LREC)},
    year={2022}
}