metadata

library_name: transformers
tags: []

AfriSenti Swahili Sentiment Regressor Description

Takes a text and predicts a sentiment value between -1 (negative) and 1 (positive), with 0 being neutral.

Regression Value Description:

Value	Sentiment
-1	Negative
0	Neutral
1	Positive
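
Because the model returns a continuous score rather than a class, a downstream step may want to map each score back to one of the three labels in the table above. The following is a minimal sketch, assuming the nearest anchor value (-1, 0 or 1) decides the label. The function name `score_to_label` and the clamping step are illustrative choices, not part of the model.

```python
def score_to_label(score: float) -> str:
    """Map a regression score to the nearest sentiment label.

    Assumption: the closest anchor value (-1, 0 or 1) determines the label.
    This helper is hypothetical and not shipped with the model.
    """
    anchors = {-1: "Negative", 0: "Neutral", 1: "Positive"}
    # A regression output may fall outside [-1, 1], so clamp it first.
    clamped = max(-1.0, min(1.0, score))
    # Pick the anchor with the smallest absolute distance to the score.
    nearest = min(anchors, key=lambda anchor: abs(clamped - anchor))
    return anchors[nearest]
```

For example, `score_to_label(0.8)` returns `"Positive"` and `score_to_label(0.1)` returns `"Neutral"`. Other cut-off points may suit a given application better.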

How to Get Started with the Model

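Use the code below to get started with the model. It runs batched inference over a CSV file (test.csv) with tweet and label columns and writes the results to predictions.csv.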

import math
import torch
import pandas as pd
from transformers import AutoModelForSequenceClassification, AutoTokenizer

BATCH_SIZE = 32
ds = pd.read_csv('test.csv')
BASE_MODEL = 'HausaNLP/afrisenti-swa-regression'

device = 'cuda' if torch.cuda.is_available() else 'cpu'

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
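model = AutoModelForSequenceClassification.from_pretrained(BASE_MODEL)
model.to(device)  # the model must be on the same device as the encoded inputs
model.eval()      # disable dropout for inference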

nb_batches = math.ceil(len(ds)/BATCH_SIZE)
y_preds = []

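with torch.no_grad():  # gradients are not needed for inference
    for i in range(nb_batches):
        # The tokenizer expects a list of strings, not a pandas Series.
        input_texts = ds["tweet"].iloc[i * BATCH_SIZE:(i + 1) * BATCH_SIZE].tolist()
        encoded = tokenizer(input_texts, truncation=True, padding="max_length", max_length=256, return_tensors="pt").to(device)
        # Flatten the logits so that each text contributes one prediction.
        y_preds += model(**encoded).logits.reshape(-1).tolist()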

# Collect the inputs, reference labels and predictions in one table.
df = pd.DataFrame({"Text": ds["tweet"], "Label": ds["label"], "Prediction": y_preds})
df.to_csv('predictions.csv', index=False)
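
Once predictions.csv is written, the Label and Prediction columns can be compared. The following is a minimal sketch, assuming the label column already holds numeric values on the same -1 to 1 scale as the predictions (if it holds class names instead, they would need to be mapped to numbers first). `mean_absolute_error` is a hypothetical helper written for this example, not a function of the model or of transformers.

```python
def mean_absolute_error(labels, predictions):
    """Average absolute difference between reference labels and predictions.

    Assumption: both sequences hold numeric values on the same scale.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one label/prediction pair is required")
    # Sum the per-example absolute errors, then divide by the number of examples.
    total = sum(abs(float(y) - float(p)) for y, p in zip(labels, predictions))
    return total / len(labels)
```

With the DataFrame built above, this could be called as `mean_absolute_error(df["Label"].tolist(), df["Prediction"].tolist())`. Lower values indicate predictions closer to the reference labels.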