Vineedhar committed on
Commit 1fb1087
1 Parent(s): 4be05be

Update README.md

Files changed (1):
  1. README.md +52 -77
README.md CHANGED
@@ -9,11 +9,10 @@ pipeline_tag: text-classification

  # Model Card for orYx-models/finetuned-roberta-leadership-sentiment-analysis

- - This model is a finetuned version of the RoBERTa text classifier.
- - The finetuning was done on a dataset of statements made by corporate executives to their therapist.
- - The sole purpose of the model is to determine whether a statement made by a corporate executive is "Positive", "Negative", or "Neutral", together with a confidence level, i.e. the percentage of the sentiment expressed in the statement.
- - The sentiment analysis tool was built specifically for our client firm, "LDS".
- - Since this is a prototype tool by orYx Models, feedback and insights from LDS will be used to finetune the model further.
+ - This model is a finetuned version of the RoBERTa text classifier. The finetuning was done on a dataset of statements made by corporate executives to their therapist.
+   The sole purpose of the model is to determine whether a statement made by a corporate executive is "Positive", "Negative", or "Neutral", together with a confidence level, i.e. the percentage of the sentiment expressed in the statement.
+   The sentiment analysis tool was built specifically for our client firm, "LDS".
+   Since this is a prototype tool by orYx Models, feedback and insights from LDS will be used to finetune the model further.
@@ -22,8 +21,8 @@ pipeline_tag: text-classification

  ### Model Description

  - This model is finetuned from a RoBERTa-base model trained on ~124M tweets from January 2018 to December 2021, and further finetuned for sentiment analysis with the TweetEval benchmark.
- - The original Twitter-based RoBERTa model can be found here, and the original reference paper is TweetEval.
- - This model is suitable for English.
+   The original Twitter-based RoBERTa model can be found here, and the original reference paper is TweetEval.
+   This model is suitable for English.
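For reference, this is roughly how the classifier behind this card is invoked through the `transformers` pipeline API (a minimal sketch; the repo id comes from this card's title, and the example sentence is illustrative, not reproduced output):

```python
from transformers import pipeline

# Load the finetuned classifier from the Hugging Face Hub
# (repo id taken from this model card's title).
classifier = pipeline(
    "text-classification",
    model="orYx-models/finetuned-roberta-leadership-sentiment-analysis",
)

# The model returns a label plus a confidence score, in the same shape as
# the Out[7] cell quoted below, e.g. [{'label': 'Positive', 'score': 0.99...}]
print(classifier("The coaching sessions have made my team far more effective."))
```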
 
@@ -74,7 +73,7 @@ Out[7]: [{'label': 'Positive', 'score': 0.9996090531349182}]

  X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, stratify=y)

  - **Train data:** 80% of 4396 records = 3516
- - **Test data:** 20% of 4396 records = 789
+ - **Test data:** 20% of 4396 records = 879

  ### Training Procedure
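The 80/20 figures above come from a stratified scikit-learn split. A minimal self-contained sketch (the toy `X` and `y` below are placeholders for the card's 4396 labelled statements):

```python
from sklearn.model_selection import train_test_split

# Toy stand-ins: X holds the statements, y the sentiment labels.
X = [f"statement {i}" for i in range(100)]
y = ["Positive"] * 40 + ["Negative"] * 30 + ["Neutral"] * 30

# stratify=y keeps the Positive/Negative/Neutral proportions identical in
# both splits, so the validation set mirrors the class balance of the data.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, stratify=y)

print(len(X_train), len(X_val))  # 80 20
```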
@@ -90,97 +89,73 @@ X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, stratify=y)

  #### Training Hyperparameters

- args = TrainingArguments(
-     output_dir = "output",
-     do_train = True,
-     do_eval = True,
-     num_train_epochs = 1,
-     per_device_train_batch_size = 4,
-     per_device_eval_batch_size = 8,
-     warmup_steps = 50,
-     weight_decay = 0.01,
-     logging_strategy = "steps",
-     logging_dir = "logging",
-     logging_steps = 50,
-     eval_steps = 50,
-     save_strategy = "steps",
-     fp16 = True,
-     #load_best_model_at_end = True
- )
+ - **TrainingArguments**
+   - output_dir = "output"
+   - do_train = True
+   - do_eval = True
+   - num_train_epochs = 1
+   - per_device_train_batch_size = 4
+   - per_device_eval_batch_size = 8
+   - warmup_steps = 50
+   - weight_decay = 0.01
+   - logging_strategy = "steps"
+   - logging_dir = "logging"
+   - logging_steps = 50
+   - eval_steps = 50
+   - save_strategy = "steps"
+   - fp16 = True
+   - load_best_model_at_end = True
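As a sketch of how these settings plug into the Hugging Face `Trainer`: the base checkpoint and the tiny in-memory dataset are assumptions for illustration (the card names neither the exact checkpoint nor the data preparation), and `eval_strategy="steps"` is added because `load_best_model_at_end=True` requires the save and eval strategies to match:

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Assumed base checkpoint: the description above matches cardiffnlp's
# Twitter-RoBERTa sentiment model from TweetEval.
BASE = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForSequenceClassification.from_pretrained(BASE)

def tokenized(texts, labels):
    # Tiny in-memory stand-in for the card's 4396-record dataset.
    ds = Dataset.from_dict({"text": texts, "label": labels})
    return ds.map(lambda b: tokenizer(b["text"], truncation=True), batched=True)

train_ds = tokenized(["Great quarter, the team delivered.", "I feel burnt out."] * 8,
                     [2, 0] * 8)
eval_ds = tokenized(["Things are okay.", "Morale is low."] * 4, [1, 0] * 4)

args = TrainingArguments(
    output_dir="output",
    do_train=True,
    do_eval=True,
    num_train_epochs=1,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    warmup_steps=50,
    weight_decay=0.01,
    logging_strategy="steps",
    logging_dir="logging",
    logging_steps=50,
    eval_strategy="steps",   # assumed; on older transformers: evaluation_strategy
    eval_steps=50,
    save_strategy="steps",
    fp16=True,               # requires a CUDA GPU; drop on CPU-only machines
    load_best_model_at_end=True,
)

trainer = Trainer(model=model, args=args, tokenizer=tokenizer,
                  train_dataset=train_ds, eval_dataset=eval_ds)
print(trainer.train())  # returns a TrainOutput like the one summarized below
```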
 
  #### Speeds, Sizes, Times [optional]

- TrainOutput(global_step=879,
-     training_loss=0.1825900522650848,
-     metrics={'train_runtime': 101.6309,
-              'train_samples_per_second': 34.596,
-              'train_steps_per_second': 8.649,
-              'total_flos': 346915041274368.0,
-              'train_loss': 0.1825900522650848,
-              'epoch': 1.0})
-
- ### Testing Data
-
- 20% of the 4396-record dataset = 789 points.
+ - **TrainOutput**
+   - global_step = 879
+   - training_loss = 0.1825900522650848
+ - **Metrics**
+   - 'train_runtime': 101.6309
+   - 'train_samples_per_second': 34.596
+   - 'train_steps_per_second': 8.649
+   - 'total_flos': 346915041274368.0
+   - 'train_loss': 0.1825900522650848
+   - 'epoch': 1.0
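The throughput numbers follow directly from the step count, batch size, and runtime, which gives a quick way to sanity-check the quoted TrainOutput:

```python
# Consistency check on the TrainOutput figures above.
global_step = 879          # optimizer steps: 3516 train records / batch size 4
train_runtime = 101.6309   # seconds

steps_per_second = global_step / train_runtime
samples_per_second = steps_per_second * 4  # per_device_train_batch_size = 4

print(round(steps_per_second, 3))    # 8.649
print(round(samples_per_second, 3))  # 34.596
```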
 
  #### Metrics

- Accuracy
- F1 Score
- Precision
- Recall
+ - Accuracy
+ - F1 Score
+ - Precision
+ - Recall
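All four scores can be computed with scikit-learn. A minimal sketch: the tiny arrays are placeholders, and macro averaging is an assumption, since the card does not say how the three classes were averaged:

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)

# Placeholder ground truth and predictions for the validation split.
y_val = ["Positive", "Negative", "Neutral", "Positive", "Neutral"]
preds = ["Positive", "Negative", "Neutral", "Positive", "Positive"]

print("accuracy :", accuracy_score(y_val, preds))
print("f1       :", f1_score(y_val, preds, average="macro"))
print("precision:", precision_score(y_val, preds, average="macro"))
print("recall   :", recall_score(y_val, preds, average="macro"))
```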
 
135
  ## Evaluation Results

- loss: train 0.049349, val 0.108378
- Accuracy: train 0.988908, val 0.976136
- F1: train 0.987063, val 0.972464
- Precision: train 0.982160, val 0.965982
- Recall: train 0.992357, val 0.979861
-
- #### Summary
-
- Accuracy: train 98.8%, val 97.6%
- F1: train 98.7%, val 97.2%
- Precision: train 98.2%, val 96.5%
- Recall: train 99.2%, val 97.9%
-
- {{ model_examination | default("[More Information Needed]", true) }}
+ **loss**
+ - train 0.049349
+ - val 0.108378
+
+ **Accuracy**
+ - train 0.988908 (**98.8%**)
+ - val 0.976136 (**97.6%**)
+
+ **F1**
+ - train 0.987063 (**98.7%**)
+ - val 0.972464 (**97.2%**)
+
+ **Precision**
+ - train 0.982160 (**98.2%**)
+ - val 0.965982 (**96.5%**)
+
+ **Recall**
+ - train 0.992357 (**99.2%**)
+ - val 0.979861 (**97.9%**)

  ## Environmental Impact

- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

  Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
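Beyond the web calculator, emissions can also be measured while the training runs. A sketch using the `codecarbon` package (an alternative tool, not something this card reports using; `trainer` is the object from the Trainer sketch above):

```python
from codecarbon import EmissionsTracker

# Track energy use and CO2-equivalent emissions around the training run.
tracker = EmissionsTracker(project_name="leadership-sentiment-finetune")
tracker.start()
try:
    trainer.train()
finally:
    emissions_kg = tracker.stop()  # estimated kg of CO2eq

print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```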