cardiffnlp/twitter-roberta-base-2021-124m-topic-multi

This model is a fine-tuned version of cardiffnlp/twitter-roberta-base-2021-124m, trained on the cardiffnlp/tweet_topic_multi dataset via tweetnlp. It was trained on the train_all split, and its hyperparameters were tuned on the validation_2021 split.

The following metrics were achieved on the test_2021 test split:

  • F1 (micro): 0.7528230865746549
  • F1 (macro): 0.5564228688431104
  • Accuracy: 0.535437760571769
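The gap between the three numbers above is typical for multi-label classification. The sketch below illustrates how such metrics can differ on a tiny, made-up example. It assumes that "Accuracy" means exact-match accuracy (every label of a tweet must be correct), which the card does not state. The helper names and the toy data are hypothetical and are not taken from tweetnlp.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_scores(gold, pred, n_labels):
    """Compute micro F1, macro F1 and exact-match accuracy for 0/1 label vectors."""
    tp = [0] * n_labels
    fp = [0] * n_labels
    fn = [0] * n_labels
    for g, p in zip(gold, pred):
        for k in range(n_labels):
            if p[k] and g[k]:
                tp[k] += 1
            elif p[k]:
                fp[k] += 1
            elif g[k]:
                fn[k] += 1
    # Micro F1 pools the counts over all labels, so frequent labels dominate.
    micro = f1(sum(tp), sum(fp), sum(fn))
    # Macro F1 averages per-label F1, so rare labels weigh as much as common ones.
    macro = sum(f1(tp[k], fp[k], fn[k]) for k in range(n_labels)) / n_labels
    # Exact match (assumed meaning of "Accuracy"): the whole label set must agree.
    exact = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return {"micro_f1": micro, "macro_f1": macro, "exact_match": exact}


# Toy data (hypothetical): 4 tweets, 3 topics; label 0 is frequent, labels 1 and 2 are rare.
gold = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [1, 0, 0]]
pred = [[1, 0, 0], [1, 0, 0], [0, 0, 0], [1, 0, 0]]
scores = multilabel_scores(gold, pred, n_labels=3)
print(scores)
```

In this toy case the frequent label is always predicted correctly while the two rare labels are always missed, so micro F1 stays high while macro F1 drops sharply.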

Usage

Install tweetnlp via pip.

pip install tweetnlp

Load the model in Python and classify a tweet.

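import tweetnlp

# Load the fine-tuned multi-label topic classifier (max_length is passed through as in the original example).
model = tweetnlp.Classifier("cardiffnlp/twitter-roberta-base-2021-124m-topic-multi", max_length=128)
# Mentions and URLs in the input are already masked as {@user@} and {{URL}}.
model.predict('Get the all-analog Classic Vinyl Edition of "Takin Off" Album from {@herbiehancock@} via {@bluenoterecords@} link below {{URL}}')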
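The example input above wraps account names as {@name@} and replaces the link with {{URL}}. A raw tweet would presumably need the same treatment before being passed to the model. The following is a minimal sketch of such a preprocessing step, inferred only from that example. The function name `format_tweet` and both regular expressions are assumptions, not part of tweetnlp, and the sketch wraps every mention the same way, which may not match how the training data treated all accounts.

```python
import re

# Assumed patterns: simple http(s) links and @handles made of word characters.
URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@(\w+)")


def format_tweet(text: str) -> str:
    """Mask links as {{URL}} and wrap mentions as {@handle@} (hypothetical helper)."""
    # Replace URLs first so that an '@' inside a link is not treated as a mention.
    text = URL_PATTERN.sub("{{URL}}", text)
    # Wrap each remaining @handle in the {@...@} markers seen in the example input.
    text = MENTION_PATTERN.sub(r"{@\1@}", text)
    return text


raw = "New album from @herbiehancock via @bluenoterecords link below https://example.com/vinyl"
formatted = format_tweet(raw)
print(formatted)
```

The formatted string can then be passed to `model.predict` in place of a hand-masked tweet. Note that this simple mention pattern would also match the domain part of an email address, so real inputs may need a stricter rule.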

Reference

@inproceedings{camacho-collados-etal-2022-tweetnlp,
    title = "{T}weet{NLP}: {C}utting-{E}dge {N}atural {L}anguage {P}rocessing for {S}ocial {M}edia",
    author = "Camacho-Collados, Jose and Rezaee, Kiamehr and Riahi, Talayeh and Ushio, Asahi and Loureiro, Daniel and Antypas, Dimosthenis and Boisson, Joanne and Espinosa-Anke, Luis and Liu, Fangyu and Mart{\'\i}nez-C{\'a}mara, Eugenio and others",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = nov,
    year = "2022",
    address = "Abu Dhabi, U.A.E.",
    publisher = "Association for Computational Linguistics",
}
