Vineedhar committed on
Commit
dd1d390
1 Parent(s): 691d7ae

Update README.md

Files changed (1)
  1. README.md +44 -43
README.md CHANGED
@@ -9,69 +9,47 @@ pipeline_tag: text-classification
 
  # Model Card for orYx-models/finetuned-roberta-leadership-sentiment-analysis
 
- - This model is a finetuned version of, roberta text classifier. The finetuning has been done on the dataset which includes inputs from corporate executives to their therapist.
- The sole purpose of the model is to determine wether the statement made from the corporate executives is "Positive, Negative, or Neutral" with which we will also see "Confidence level, i.e the percentage of the sentiment involved in a statement.
- The sentiment analysis tool has been particularly built for our client firm called "LDS".
- Since it is prototype tool by orYx Models, all the feedback and insights from LDS will be used to finetune the model further.
-
-
 
  ## Model Details
 
- ### Model Description
-
- - This model is finetuned on a RoBERTa-base model trained on ~124M tweets from January 2018 to December 2021,and finetuned for sentiment analysis with the TweetEval benchmark.
- The original Twitter-based RoBERTa model can be found here and the original reference paper is TweetEval.
- This model is suitable for English.
-
 
-
- - **Developed by:** orYx Models
- - **Shared by [optional]:** Vineedhar, relkino, kalhosni
- - **Model type:** Text Classifier
- - **Language(s) (NLP):** English
  - **License:** MIT
- - **Finetuned from model [optional]:** cardiffnlp/twitter-roberta-base-sentiment-latest
-
- ### Model Sources [optional]
 
- This is HuggingFace modelID - cardiffnlp/twitter-roberta-base-2021-124m
 
- - **Repository:** More Information Needed
- - **Paper [optional]:** TimeLMs - https://arxiv.org/abs/2202.03829
 
  ## Uses
 
- -The Sentiment Analysis tool is made domain specific, however since it is a protoype, the depths into domain are still to be ventured.
-
- - **Use case:** We can analyse the text from any executive, employee, client of an organization and attach a sentiment to it.
- - The outcomes of this will be a "Scored sentiment" upon which we can look for likeliness of an event occurring or vice versa.
- - The resultant scenario to this will be to generate a rating system based on the sentiments generated by texts from an entity.
 
  ### Direct Use
- ```
 
- nlp = pipeline("sentiment-analysis", model = model, tokenizer = tokenizer)
 
- nlp("The results don't match but the effort seems to be always high")
 
  Out[7]: [{'label': 'Positive', 'score': 0.9996090531349182}]
-
  ```
 
-
 
  ### Recommendations
-
-
-
 
 
 
  ## Training Details
-
-
-
-
- ### Training Data
  ```
 
  X_train, X_val, y_train, y_val = train_test_split(X,y, test_size = 0.2, stratify = y)
@@ -84,7 +62,11 @@ X_train, X_val, y_train, y_val = train_test_split(X,y, test_size = 0.2, stratify
 
  ### Training Procedure
 
-
 
  #### Preprocessing [optional]
  ```
@@ -114,11 +96,14 @@ args = TrainingArguments(
  )
  ```
  #### Speeds, Sizes, Times [optional]
- ```
  - **TrainOutput**
 
  global_step=879,
  training_loss=0.1825900522650848,
 
  - **Metrics**
 
  'train_runtime': 101.6309,
  'train_samples_per_second': 34.596,
  'train_steps_per_second': 8.649,
@@ -129,6 +114,22 @@ training_loss=0.1825900522650848,
 
  ## Evaluation Metrics Results
 
  **loss**
  - train 0.049349
 
 
  # Model Card for orYx-models/finetuned-roberta-leadership-sentiment-analysis
 
+ - **Model Description:** This model is a finetuned version of the RoBERTa text classifier. It has been trained on a dataset comprising communications from corporate executives to their therapists. Its primary function is to determine whether statements from corporate executives convey a "Positive," "Negative," or "Neutral" sentiment, accompanied by a confidence level indicating the percentage of sentiment expressed in a statement. The sentiment analysis tool is specifically developed for our client firm called "LDS." Being a prototype tool by orYx Models, all feedback and insights from LDS will be used to further refine the model.
 
  ## Model Details
 
+ ### Model Information
 
+ - **Model Type:** Text Classifier
+ - **Language(s):** English
  - **License:** MIT
+ - **Finetuned from Model:** cardiffnlp/twitter-roberta-base-sentiment-latest
 
+ ### Model Sources
 
+ - **HuggingFace Model ID:** cardiffnlp/twitter-roberta-base-2021-124m
+ - **Paper:** TimeLMs - [Link](https://arxiv.org/abs/2202.03829)
 
  ## Uses
 
+ - **Use case:** This sentiment analysis tool can analyze text from any entity within an organization, such as executives, employees, or clients, and assign a sentiment to it.
+ - **Outcomes:** The tool generates a "Scored sentiment" that can be used to gauge how likely an event is to occur. It can also support a rating system based on the sentiments expressed in texts.
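The rating-system idea can be sketched as follows. This is a hypothetical aggregation, not part of the model: the signed-score convention, thresholds, and function name are illustrative assumptions.

```python
# Hypothetical sketch: turn a batch of classifier outputs into a 1-5 rating.
# The signed-score convention and the linear rescaling are illustrative choices.

def sentiment_to_rating(results):
    """Average signed sentiment confidences and map them to a 1-5 rating."""
    sign = {"Positive": 1.0, "Neutral": 0.0, "Negative": -1.0}
    # Mean signed confidence, in [-1, 1]
    mean = sum(sign[r["label"]] * r["score"] for r in results) / len(results)
    # Linearly rescale [-1, 1] -> [1, 5]
    return round(1 + 2 * (mean + 1), 1)

example = [
    {"label": "Positive", "score": 0.9996},
    {"label": "Negative", "score": 0.8100},
    {"label": "Neutral", "score": 0.7500},
]
print(sentiment_to_rating(example))  # 3.1 (slightly above the neutral midpoint of 3)
```

Any monotone mapping would do; the linear one simply keeps a fully negative batch at 1 and a fully positive batch at 5.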
 
  ### Direct Use
 
+ ```python
+ nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
 
+ nlp("The results don't match, but the effort seems to be always high")
 
  Out[7]: [{'label': 'Positive', 'score': 0.9996090531349182}]
  ```
 
+ - Based on the text, the outcome is one of "Positive," "Negative," or "Neutral," along with its confidence score.
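The `score` in the pipeline output is the softmax probability of the predicted class. A minimal, dependency-free sketch of how such a label/score pair arises (the logits below are made-up numbers, not real model outputs):

```python
import math

# Minimal sketch: how a pipeline-style label/score pair arises from logits.
# The logits below are made-up numbers, not real model outputs.
labels = ["Negative", "Neutral", "Positive"]
logits = [-1.2, 0.3, 3.5]

# Softmax turns raw logits into probabilities that sum to 1
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

# The pipeline reports the argmax label, with its probability as `score`
best = max(range(len(labels)), key=lambda i: probs[i])
result = [{"label": labels[best], "score": probs[best]}]
print(result)
```

A high score like 0.9996 therefore means the model put almost all probability mass on one class, not that the sentiment itself is "99.96% positive".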
 
  ### Recommendations
+ - Continuous Monitoring: Regularly monitor the model's performance on new data to ensure its effectiveness and reliability over time.
+ - Error Analysis: Conduct thorough error analysis to identify common patterns of misclassification and areas for improvement.
+ - Fine-Tuning: Consider fine-tuning the model further based on feedback and insights from users, especially LDS, to enhance its domain-specific performance.
+ - Model Interpretability: Explore techniques for explaining the model's predictions, such as attention mechanisms or feature importance analysis, to increase trust in and understanding of its decisions.
 
  ## Training Details
 
  ```
 
  X_train, X_val, y_train, y_val = train_test_split(X,y, test_size = 0.2, stratify = y)
 
  ### Training Procedure
 
+ - **Dataset Split:** Data divided into 80% training and 20% validation sets.
+ - **Preprocessing:** Input data tokenized into 'input_ids' and 'attention_mask' tensors.
+ - **Training Hyperparameters:** Set for training, evaluation, and optimization, including batch size, epochs, and logging strategies.
+ - **Training Execution:** Model trained with specified hyperparameters, monitored with metrics, and logged for evaluation.
+ - **Evaluation Metrics:** Model evaluated on loss, accuracy, F1 score, precision, and recall for both training and validation sets.
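The dataset-split step above relies on `train_test_split(..., stratify=y)`. What stratification guarantees can be shown with a small dependency-free sketch (the helper name and toy data are illustrative, not the project's actual code):

```python
import random

# Sketch of a stratified 80/20 split, i.e. what `stratify=y` guarantees:
# each label keeps (approximately) the same proportion in train and validation.
def stratified_split(X, y, test_size=0.2, seed=0):
    rng = random.Random(seed)
    by_label = {}
    for i, label in enumerate(y):
        by_label.setdefault(label, []).append(i)
    train_idx, val_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        cut = int(round(len(indices) * (1 - test_size)))
        train_idx += indices[:cut]
        val_idx += indices[cut:]
    return ([X[i] for i in train_idx], [X[i] for i in val_idx],
            [y[i] for i in train_idx], [y[i] for i in val_idx])

X = [f"text {i}" for i in range(10)]
y = ["Positive"] * 5 + ["Negative"] * 5
X_train, X_val, y_train, y_val = stratified_split(X, y)
print(len(X_train), len(X_val))  # 8 2
```

Without stratification, a small validation set could easily end up with no examples of a minority class, which would make the per-class precision/recall figures meaningless.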
 
  #### Preprocessing [optional]
  ```
 
  )
  ```
  #### Speeds, Sizes, Times [optional]
+
  - **TrainOutput**
+ ```
  global_step=879,
  training_loss=0.1825900522650848,
+ ```
  - **Metrics**
+ ```
  'train_runtime': 101.6309,
  'train_samples_per_second': 34.596,
  'train_steps_per_second': 8.649,
 
  ## Evaluation Metrics Results
 
+ ```python
+ # Evaluate both splits with the Trainer instance (assumed to be named `trainer`)
+ q = [trainer.evaluate(eval_dataset=df) for df in [train_dataset, val_dataset]]
+
+ # Create a DataFrame indexed by split and keep only the first 5 columns
+ result_df = pd.DataFrame(q, index=["train", "val"]).iloc[:, :5]
+
+ # Display the resulting DataFrame
+ print(result_df)
+
+        eval_loss  eval_Accuracy   eval_F1  eval_Precision  eval_Recall
+ train   0.049349       0.988908  0.987063        0.982160     0.992357
+ val     0.108378       0.976136  0.972464        0.965982     0.979861
+ ```
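The accuracy and macro-averaged precision/recall/F1 in the table can be computed without any library. A small sketch with made-up predictions; note that combining macro precision and macro recall into an F1 (as below) is one common convention, and differs slightly from averaging per-class F1 scores:

```python
# Sketch: accuracy and macro precision/recall/F1 by hand,
# mirroring the metrics in the table above (the data here is made up).
def macro_metrics(y_true, y_pred):
    labels = sorted(set(y_true))
    precisions, recalls = [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    precision = sum(precisions) / len(labels)
    recall = sum(recalls) / len(labels)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"Accuracy": accuracy, "Precision": precision,
            "Recall": recall, "F1": f1}

y_true = ["Positive", "Positive", "Negative", "Neutral"]
y_pred = ["Positive", "Negative", "Negative", "Neutral"]
print(macro_metrics(y_true, y_pred))
```

Macro averaging weights every class equally, which is why precision and recall can diverge noticeably even when accuracy is high.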
 
  **loss**
  - train 0.049349