
This repository contains ONLY the LoRA adapters, not the full model!

Base model: https://huggingface.co/mesolitica/malaysian-tinyllama-1.1b-16k-instructions-v4

Fine-tuned on this dataset: https://huggingface.co/datasets/kaiimran/malaysia-tweets-sentiment

Following this tutorial: https://colab.research.google.com/drive/1AZghoNBQaMDgWJpi4RbffGM1h6raLUj9?usp=sharing
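For reference, here is a minimal sketch of how these adapters might be loaded on top of the base model with the transformers and peft libraries. The adapter repo id and the prompt format below are placeholders/assumptions, not values documented in this card.

```python
# Minimal loading sketch (assumes the transformers and peft libraries are installed).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "mesolitica/malaysian-tinyllama-1.1b-16k-instructions-v4"
ADAPTER_ID = "your-username/this-adapter-repo"  # placeholder: substitute this repo's id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(BASE_ID)

# Attach the LoRA adapters on top of the frozen base weights;
# only the small adapter matrices live in this repository.
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
model.eval()

# Hypothetical prompt format -- the actual instruction template
# used during fine-tuning is not documented in this card.
prompt = "Classify the sentiment of this tweet: Saya suka tempat ini!"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```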

Evaluation on test dataset

  1. Accuracy: 0.9455
    • Interpretation: Approximately 94.55% of the model's predictions are correct. This is a high accuracy rate, indicating that the model performs well on the test dataset overall.
  2. Precision: 0.9936
    • Interpretation: Out of all the positive predictions made by the model, 99.36% were correct. This suggests that the model is very good at identifying true positive cases and has a very low false positive rate.
  3. Recall: 0.8980
    • Interpretation: Out of all the actual positive cases in the dataset, the model correctly identified 89.80% of them. This is a good recall rate, but it is noticeably lower than precision, indicating some false negatives (positive cases that the model failed to identify).
  4. F1 Score: 0.9434
    • Interpretation: The F1 score is the harmonic mean of precision and recall, balancing the two: F1 = 2 × (Precision × Recall) / (Precision + Recall) = 2 × (0.9936 × 0.8980) / (0.9936 + 0.8980) ≈ 0.9434. This indicates that the model achieves a good balance between precision and recall (a sketch of how these scores are typically computed follows this list).
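As a reference, a minimal sketch of how the four scores above are typically computed with scikit-learn; y_true and y_pred are toy placeholders standing in for the gold labels and model predictions on the test split (1 = positive, 0 = negative).

```python
# Sketch of the standard metric computation (scikit-learn).
# y_true / y_pred are toy placeholder values, not the actual test data.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0]  # gold labels: 1 = positive, 0 = negative
y_pred = [1, 0, 1, 0, 0]  # model predictions

print(f"Accuracy:  {accuracy_score(y_true, y_pred):.4f}")
print(f"Precision: {precision_score(y_true, y_pred):.4f}")
print(f"Recall:    {recall_score(y_true, y_pred):.4f}")
print(f"F1 Score:  {f1_score(y_true, y_pred):.4f}")
```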

Overall Assessment

  • High Precision: The model has an excellent precision score, meaning it is highly reliable in predicting positive sentiment without mistakenly labeling too many negative cases as positive.
  • Good Recall: The recall score is also good, but slightly lower than precision, suggesting that the model misses some positive cases.
  • Balanced Performance: The F1 score indicates that the model maintains a good balance between precision and recall, which is crucial for tasks like sentiment analysis.

Considerations for Improvement

  • Recall Improvement: Since recall is lower than precision, we might consider strategies to improve it, such as:
    • Data Augmentation: Adding more training data, particularly positive samples, might help the model learn to identify positive cases better.
    • Hyperparameter Tuning: For example, training for more epochs or adjusting the learning rate (a minimal sketch follows this list).
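A minimal sketch of the kind of hyperparameter changes meant above, using the transformers TrainingArguments API. The specific values are illustrative assumptions for a hypothetical re-run, not the settings used to train this model.

```python
# Illustrative hyperparameter tweaks for a fine-tuning re-run.
# All values below are assumptions for demonstration only.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",
    num_train_epochs=5,             # train longer than a typical short run
    learning_rate=1e-4,             # slightly lower LR for more stable updates
    per_device_train_batch_size=8,
    warmup_ratio=0.05,
    logging_steps=50,
)
```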

Conclusion

The model shows strong performance, with particularly high precision and a good overall F1 score. The slightly lower recall suggests room for improvement, but the current metrics indicate that the model is very effective for binary sentiment analysis.
