tweet-topic-latest-single

This is a RoBERTa-base model trained on 168.86M tweets up to the end of September 2022 and fine-tuned for single-label topic classification on a corpus of 6,997 tweets. The original RoBERTa-base model can be found here. This model is suitable for English.

Labels:

  • 0 -> arts_&_culture
  • 1 -> business_&_entrepreneurs
  • 2 -> pop_culture
  • 3 -> daily_life
  • 4 -> sports_&_gaming
  • 5 -> science_&_technology
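
For a quick sanity check of these labels, the model can also be loaded through the transformers pipeline API, which applies the softmax and label mapping for you. This is a minimal sketch; the example tweet is invented for illustration:

from transformers import pipeline

pipe = pipeline("text-classification", model="cardiffnlp/tweet-topic-latest-single")
print(pipe("Watching the match with friends tonight!"))
# prints a list like [{'label': '...', 'score': ...}]; the exact label and score depend on the input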

Full classification example

from transformers import AutoModelForSequenceClassification, TFAutoModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
from scipy.special import softmax

MODEL = "cardiffnlp/tweet-topic-latest-single"
tokenizer = AutoTokenizer.from_pretrained(MODEL)

# PT
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
class_mapping = model.config.id2label  # {0: 'arts_&_culture', 1: 'business_&_entrepreneurs', ...}

text = "Tesla stock is on the rise!"
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)

# Convert the raw logits to probabilities
scores = output[0][0].detach().numpy()
scores = softmax(scores)

# TF
#model = TFAutoModelForSequenceClassification.from_pretrained(MODEL)
#class_mapping = model.config.id2label
#text = "Tesla stock is on the rise!"
#encoded_input = tokenizer(text, return_tensors='tf')
#output = model(**encoded_input)
#scores = output[0][0].numpy()
#scores = softmax(scores)

# Rank the labels from most to least likely
ranking = np.argsort(scores)
ranking = ranking[::-1]
for i in range(scores.shape[0]):
    l = class_mapping[ranking[i]]
    s = scores[ranking[i]]
    print(f"{i+1}) {l} {np.round(float(s), 4)}")

Output:

1) business_&_entrepreneurs 0.8929
2) sports_&_gaming 0.0478
3) science_&_technology 0.0185
4) daily_life 0.0178
5) arts_&_culture 0.0128
6) pop_culture 0.0102
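
If only the single predicted topic is needed, the ranking loop can be replaced by an argmax. This small sketch reuses scores and class_mapping from the example above:

predicted_id = int(np.argmax(scores))
print(class_mapping[predicted_id])  # business_&_entrepreneurs for the example tweet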