dsfsi/PuoBERTa-News

vukosi committed on Oct 13, 2023

Commit 954b4a5 • 1 Parent(s): 18993d5

Update README.md

Files changed (1)

README.md +87 -0

@@ -1,3 +1,90 @@
 ---
 license: cc-by-4.0
+language:
+- tn
+datasets:
+- dsfsi/PuoData
+metrics:
+- f1
+library_name: transformers
+pipeline_tag: text-classification
+tags:
+- iptc
 ---
+# PuoBERTa-News: A Setswana Language Model Finetuned for News Categorisation
+A RoBERTa-based language model finetuned for news categorisation in Setswana.
+Based on [PuoBERTa](https://huggingface.co/dsfsi/PuoBERTa).
+## Model Details
+### Model Description
+This is a News Categorisation model for Setswana.
+- **Developed by:** Vukosi Marivate ([@vukosi](https://huggingface.co/@vukosi)), Moseli Mots'Oehli ([@MoseliMotsoehli](https://huggingface.co/@MoseliMotsoehli)), Valencia Wagner, Richard Lastrucci and Isheanesu Dzingirai
+- **Model type:** RoBERTa Model
+- **Language(s) (NLP):** Setswana
+- **License:** CC BY 4.0
+### News Categories
+- 0: arts_culture_entertainment_and_media
+- 1: crime_law_and_justice
+- 2: disaster_accident_and_emergency_incident
+- 3: economy_business_and_finance
+- 4: education
+- 5: environment
+- 6: health
+- 7: politics
+- 8: religion_and_belief
+- 9: society
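
The category list above can be sketched as a lookup table for turning a predicted class index into a readable name. This is a minimal illustration, assuming the model's output indices follow the numbering in this card; the `id2label` mapping stored in the model's own configuration remains the authoritative source. The names `ID2LABEL`, `LABEL2ID` and `decode_prediction` are hypothetical helpers, not part of the repository.

```python
# Category ids copied from the "News Categories" list in this model card.
# Assumption: the classifier's output indices use this same numbering.
ID2LABEL = {
    0: "arts_culture_entertainment_and_media",
    1: "crime_law_and_justice",
    2: "disaster_accident_and_emergency_incident",
    3: "economy_business_and_finance",
    4: "education",
    5: "environment",
    6: "health",
    7: "politics",
    8: "religion_and_belief",
    9: "society",
}

# Reverse lookup: category name -> class index.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def decode_prediction(scores):
    """Return (category name, score) for the highest-scoring class.

    `scores` is a sequence with one score per class, ordered by class index.
    """
    if len(scores) != len(ID2LABEL):
        raise ValueError("expected one score per category")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return ID2LABEL[best], scores[best]
```

For example, a score vector whose largest entry sits at index 7 decodes to `politics`.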
+### Model Performance
+Performance of models on the Daily News Dikgang dataset:
+| **Model**             | **5-fold Cross Validation F1** | **Test F1** |
+|-----------------------------|--------------------------------------|-------------------|
+| Logistic Regression + TFIDF | 60.1                                 | 56.2              |
+| NCHLT TSN RoBERTa           | 64.7                                 | 60.3              |
+| PuoBERTa                    | 63.8                                 | 62.9              |
+| PuoBERTaJW300               | **66.2**                             | **65.4**          |
+### Usage
+Use this model for news categorisation of Setswana text.
+```python
+```
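
The usage snippet in the committed README is empty. As a hedged sketch of how the model might be loaded, the following uses the `transformers` text-classification pipeline, which matches the `library_name` and `pipeline_tag` declared in the metadata. The repository id `dsfsi/PuoBERTa-News` is taken from this page; the helper names `build_classifier` and `top_category` are hypothetical, and whether the returned label is a category name or a generic `LABEL_n` string depends on the model's configuration, which is not shown here.

```python
MODEL_ID = "dsfsi/PuoBERTa-News"  # repository id as shown on this page


def build_classifier(model_id=MODEL_ID):
    """Create a text-classification pipeline for the given model.

    The import is done lazily so this module loads without `transformers`
    installed; calling this function downloads the weights on first use.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def top_category(classifier, text):
    """Return (label, score) for the best-scoring category of one text."""
    # For a single string the pipeline returns a list with one dict.
    # truncation=True is passed so that long articles are cut to the
    # model's maximum input length instead of raising an error.
    result = classifier(text, truncation=True)[0]
    return result["label"], result["score"]
```

A typical call would be `top_category(build_classifier(), article_text)`, where `article_text` is a Setswana news article supplied by the caller.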
+## Citation Information
+BibTeX reference:
+```
+@article{marivatePuoBERTa2023,
+  title={PuoBERTa: Training and evaluation of a curated language model for Setswana},
+  author={Vukosi Marivate and Moseli Mots'Oehli and Valencia Wagner and Richard Lastrucci and Isheanesu Dzingirai},
+  journal={ArXiv},
+}
+```
+## Contributing
+Your contributions are welcome! Feel free to improve the model.
+## Model Card Authors
+Vukosi Marivate
+## Model Card Contact
+For more details, reach out or check our [website](https://dsfsi.github.io/).
+Email: vukosi.marivate@cs.up.ac.za
+**Enjoy exploring Setswana through AI!**