---
library_name: peft
base_model: cardiffnlp/twitter-roberta-base-sentiment-latest
license: mit
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- NHL
- Hockey
- Sports
- roberta
- sentiment analysis
---

# Chelberta

Chelberta is a fine-tuned version of [cardiffnlp/twitter-roberta-base-sentiment-latest](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest), trained on 5,168 sentiment-labelled Reddit comments collected from NHL team subreddits in December 2023. This model is suitable for English.

Labels: 0 -> Negative; 1 -> Neutral; 2 -> Positive

This model has been used for the [NHL Positivity Index](https://uais.dev/projects/nhl-positivity-index/).

The full dataset can be found [here](https://www.kaggle.com/datasets/jacobwinch/nhl-reddit-comments).

## Example Pipeline

```python
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel
import torch

model_id = 'cardiffnlp/twitter-roberta-base-sentiment-latest'
peft_model_id = 'UAlbertaUAIS/Chelberta'

# Load the base model and tokenizer, then apply the Chelberta PEFT adapter.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=3)
tokenizer = AutoTokenizer.from_pretrained(model_id, max_length=512)
model = PeftModel.from_pretrained(model, peft_model_id)
# Merge the adapter weights into the base model so it works in a plain pipeline.
model = model.merge_and_unload()

# Use the first GPU when one is available; otherwise fall back to CPU (-1).
device = 0 if torch.cuda.is_available() else -1
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer, max_length=512, truncation=True, device=device)
classifier("Connor McDavid is good at hockey!")
```

```
[{'label': 'positive', 'score': 0.9888942837715149}]
```

- **Developed by:** The University of Alberta Undergraduate Artificial Intelligence Society Student Group
- **Model type:** RoBERTa-based
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [cardiffnlp/twitter-roberta-base-sentiment-latest](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest)
- **Repository:** https://github.com/UndergraduateArtificialIntelligenceClub/NHL-Positivity-Index

## Uses
Chelberta is intended for analyzing the sentiment of sports fans on social media.

## Evaluation

Chelberta was evaluated on a test set of 1,000 human-labelled NHL Reddit comments from December 2023. The test set can be found [here](https://github.com/UndergraduateArtificialIntelligenceClub/NHL-Positivity-Index/blob/main/data/training_data/NHL-SentiComments-1K-TEST.json). The model achieved 81.4% accuracy on this set.

### References

```
@inproceedings{camacho-collados-etal-2022-tweetnlp,
    title = "{T}weet{NLP}: Cutting-Edge Natural Language Processing for Social Media",
    author = "Camacho-collados, Jose and
      Rezaee, Kiamehr and
      Riahi, Talayeh and
      Ushio, Asahi and
      Loureiro, Daniel and
      Antypas, Dimosthenis and
      Boisson, Joanne and
      Espinosa Anke, Luis and
      Liu, Fangyu and
      Mart{\'\i}nez C{\'a}mara, Eugenio and others",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, UAE",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.emnlp-demos.5",
    pages = "38--49"
}
```

```
@inproceedings{loureiro-etal-2022-timelms,
    title = "{T}ime{LM}s: Diachronic Language Models from {T}witter",
    author = "Loureiro, Daniel and
      Barbieri, Francesco and
      Neves, Leonardo and
      Espinosa Anke, Luis and
      Camacho-collados, Jose",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-demo.25",
    doi = "10.18653/v1/2022.acl-demo.25",
    pages = "251--260"
}
```

## Citation

**APA:**

Winch, J., Munjal, T., Lau, H., Bradley, A., Monaghan, A., & Subedi, Y. (2023). NHL Positivity Index. Undergraduate Artificial Intelligence Society. https://uais.dev/projects/nhl-positivity-index/

### Framework versions

- PEFT 0.9.0
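### Accuracy sketch

The accuracy figure in the Evaluation section is the fraction of comments whose predicted label matches the human label. The snippet below is a minimal sketch of that calculation, not the project's actual evaluation script. It assumes predictions arrive as the lowercase label strings shown in the example pipeline output and that gold labels use the 0/1/2 ids from the Labels line. The `accuracy` helper and the toy data are hypothetical, and the structure of the real test-set JSON is not assumed here.

```python
# Label order taken from the "Labels" line of this card.
LABEL_TO_ID = {"negative": 0, "neutral": 1, "positive": 2}


def accuracy(predicted_labels, gold_ids):
    """Return the fraction of predictions whose label id matches the gold id."""
    if len(predicted_labels) != len(gold_ids):
        raise ValueError("predictions and gold labels must have the same length")
    if not gold_ids:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        LABEL_TO_ID[label.lower()] == gold
        for label, gold in zip(predicted_labels, gold_ids)
    )
    return correct / len(gold_ids)


# Toy data for illustration only; not drawn from the real test set.
toy_predictions = ["positive", "negative", "neutral", "positive"]
toy_gold = [2, 0, 1, 0]
toy_accuracy = accuracy(toy_predictions, toy_gold)
print(f"accuracy = {toy_accuracy:.2%}")
```

In practice, `toy_predictions` would be replaced by the `label` field of each classifier result and `toy_gold` by the human labels loaded from the test set.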