Edit model card
YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)


This is a RoBERTa-base model trained on ~90m tweets until the end of 2019 (see here) and finetuned for multi-label topic classification on a corpus of 11,267 tweets. The original RoBERTa-base model can be found here and the original reference paper is TweetEval. This model is suitable for English.


0: arts_&_culture 5: fashion_&_style 10: learning_&_educational 15: science_&_technology
1: business_&_entrepreneurs 6: film_tv_&_video 11: music 16: sports
2: celebrity_&_pop_culture 7: fitness_&_health 12: news_&_social_concern 17: travel_&_adventure
3: diaries_&_daily_life 8: food_&_dining 13: other_hobbies 18: youth_&_student_life
4: family 9: gaming 14: relationships

Full classification example

from transformers import AutoModelForSequenceClassification, TFAutoModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
from scipy.special import expit

MODEL = f"cardiffnlp/tweet-topic-19-multi"
tokenizer = AutoTokenizer.from_pretrained(MODEL)

# PT
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
class_mapping = model.config.id2label

text = "It is great to see athletes promoting awareness for climate change."
tokens = tokenizer(text, return_tensors='pt')
output = model(**tokens)

scores = output[0][0].detach().numpy()
scores = expit(scores)
predictions = (scores >= 0.5) * 1

# TF
#tf_model = TFAutoModelForSequenceClassification.from_pretrained(MODEL)
#class_mapping = tf_model.config.id2label
#text = "It is great to see athletes promoting awareness for climate change."
#tokens = tokenizer(text, return_tensors='tf')
#output = tf_model(**tokens)
#scores = output[0][0]
#scores = expit(scores)
#predictions = (scores >= 0.5) * 1

# Map to classes
for i in range(len(predictions)):
  if predictions[i]:


Downloads last month
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Collection including cardiffnlp/tweet-topic-19-multi