xlm-roberta-base-sentiment-multilingual-finetuned
Model description
This is a fine-tuned version of the cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual model, trained on the tyqiangz/multilingual-sentiments dataset. It's designed for multilingual sentiment analysis in English, Malay, and Chinese.
Intended uses & limitations
This model is intended for sentiment analysis tasks in English, Malay, and Chinese. It can classify text into three sentiment categories: positive, negative, and neutral.
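The model produces one logit per class, which is converted to a probability with a softmax and mapped to a label. A minimal post-processing sketch is below; the label order (negative, neutral, positive) is an assumption inherited from the base Cardiff NLP checkpoint and should be verified against this model's id2label mapping before use.

```python
import math

# Assumed id2label order; confirm against model.config.id2label.
LABELS = ["negative", "neutral", "positive"]

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits):
    """Return the top label and its probability for one example's logits."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[idx], probs[idx]

# Example: logits that favor the third class map to "positive".
label, score = predict_label([-1.2, 0.3, 2.5])
```

In practice the full model can be run through the Transformers `pipeline("text-classification", model=...)` API, which applies the same softmax-and-argmax step internally.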
Training and evaluation data
The model was trained and evaluated on the tyqiangz/multilingual-sentiments, TVL_Sentiment_Analysis, and argilla/twitter-coronavirus datasets, which include data in English, Malay, and Chinese.
Training procedure
The model was fine-tuned using the Hugging Face Transformers library.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    evaluation_strategy="steps",
    save_strategy="steps",
    load_best_model_at_end=True,
)
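The card does not show the metric function, but the reported eval_recall equals eval_accuracy exactly, which is what weighted averaging produces. A sketch of a sklearn-based compute_metrics callback for the Trainer is below; this is an assumption about the setup, not the card's actual code.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Compute accuracy and weighted precision/recall/F1 from Trainer predictions."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```

The callback would be passed to `Trainer(..., compute_metrics=compute_metrics)` so the metrics are logged at every evaluation step.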
Evaluation results
Test results:
- eval_loss: 0.5399
- eval_accuracy: 0.8276
- eval_f1: 0.8264
- eval_precision: 0.8268
- eval_recall: 0.8276
- eval_runtime: 11.37 s
- eval_samples_per_second: 231.09
- eval_steps_per_second: 3.69
- epoch: 2.0
Environmental impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).