SamLowe committed
Commit ed94491
1 Parent(s): a404282

Update README.md

Files changed (1)
  1. README.md +37 -3
README.md CHANGED
@@ -38,9 +38,6 @@ Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label, the metrics (evaluated on the go_emotions test split) are:
  - Recall: 0.250
  - F1: 0.303

- Simple mean of labels: {'precision': 0.445, 'recall': 0.476, 'f1': 0.449}
- Weighted average (using support): {'precision': 0.472, 'recall': 0.582, 'f1': 0.514}
-
  Optimising the threshold per label to optimise the F1 metric, the metrics (evaluated on the go_emotions test split) are:

  - Precision: 0.445
@@ -57,6 +54,39 @@ Weighted by the relative support of each label in the dataset, this is:

  This is a multi-label, multi-class dataset, so each label is effectively a separate binary classification and metrics are better measured per label.

+ Optimising the threshold per label to optimise the F1 metric, the metrics (evaluated on the go_emotions test split) are:
+
+ | | f1 | precision | recall | support | threshold |
+ | -------------- | ----- | --------- | ------ | ------- | --------- |
+ | admiration | 0.583 | 0.574 | 0.593 | 504 | 0.30 |
+ | amusement | 0.668 | 0.722 | 0.621 | 264 | 0.25 |
+ | anger | 0.350 | 0.309 | 0.404 | 198 | 0.15 |
+ | annoyance | 0.299 | 0.318 | 0.281 | 320 | 0.20 |
+ | approval | 0.338 | 0.281 | 0.425 | 351 | 0.15 |
+ | caring | 0.321 | 0.323 | 0.319 | 135 | 0.20 |
+ | confusion | 0.384 | 0.313 | 0.497 | 153 | 0.15 |
+ | curiosity | 0.467 | 0.432 | 0.507 | 284 | 0.20 |
+ | desire | 0.426 | 0.381 | 0.482 | 83 | 0.20 |
+ | disappointment | 0.210 | 0.147 | 0.364 | 151 | 0.10 |
+ | disapproval | 0.366 | 0.288 | 0.502 | 267 | 0.15 |
+ | disgust | 0.416 | 0.409 | 0.423 | 123 | 0.20 |
+ | embarrassment | 0.370 | 0.341 | 0.405 | 37 | 0.30 |
+ | excitement | 0.313 | 0.368 | 0.272 | 103 | 0.25 |
+ | fear | 0.615 | 0.677 | 0.564 | 78 | 0.40 |
+ | gratitude | 0.828 | 0.810 | 0.847 | 352 | 0.25 |
+ | grief | 0.545 | 0.600 | 0.500 | 6 | 0.85 |
+ | joy | 0.455 | 0.429 | 0.484 | 161 | 0.20 |
+ | love | 0.642 | 0.673 | 0.613 | 238 | 0.30 |
+ | nervousness | 0.350 | 0.412 | 0.304 | 23 | 0.60 |
+ | optimism | 0.439 | 0.417 | 0.462 | 186 | 0.20 |
+ | pride | 0.480 | 0.667 | 0.375 | 16 | 0.70 |
+ | realization | 0.232 | 0.191 | 0.297 | 145 | 0.10 |
+ | relief | 0.353 | 0.500 | 0.273 | 11 | 0.50 |
+ | remorse | 0.643 | 0.529 | 0.821 | 56 | 0.20 |
+ | sadness | 0.526 | 0.497 | 0.558 | 156 | 0.20 |
+ | surprise | 0.329 | 0.318 | 0.340 | 141 | 0.15 |
+ | neutral | 0.634 | 0.528 | 0.794 | 1787 | 0.30 |
+
  Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label, the metrics (evaluated on the go_emotions test split) are:

  | | f1 | precision | recall | support | threshold |
@@ -90,4 +120,8 @@ Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label, the metrics (evaluated on the go_emotions test split) are:
  | surprise | 0.142 | 0.786 | 0.078 | 141 | 0.5 |
  | neutral | 0.547 | 0.644 | 0.475 | 1787 | 0.5 |

+ ### Use with ONNXRuntime

+ ```python
+ pass
+ ```
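
The new "Use with ONNXRuntime" section is committed with only a `pass` placeholder. As a purely illustrative sketch of what it might eventually contain, the snippet below runs an exported graph with `onnxruntime`; the tokenizer repo id, the local `model.onnx` path, the `input_ids`/`attention_mask` input names, and the assumption that the first output is the logits are all guesses, not details from this commit.

```python
# Illustrative sketch only: paths, repo id, and graph input/output names
# are assumptions, not taken from this commit.
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("SamLowe/roberta-base-go_emotions")
session = ort.InferenceSession("model.onnx")  # assumed local export

texts = ["I am not having a great day"]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="np")

# Assumed graph input names, as is typical for an exported RoBERTa encoder;
# the first (assumed only) output is taken to be the raw logits.
logits = session.run(
    None,
    {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]},
)[0]

# Multi-label head: an independent sigmoid per label, so each of the 28
# go_emotions labels gets its own score and can be thresholded separately.
scores = 1.0 / (1.0 + np.exp(-logits))
print(scores.shape)  # (1, 28)
```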
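For reference, the per-label thresholds in the table added above are the kind of values a simple grid search produces: treat each label as an independent binary classifier, sweep candidate thresholds over its sigmoid scores, and keep the F1-maximising one. The sketch below assumes `y_true` and `y_scores` arrays of shape `(n_samples, n_labels)` are already in hand; the 0.05 grid and all names are illustrative, not from the repo.

```python
# Sketch of a per-label threshold search; names and grid are illustrative.
import numpy as np
from sklearn.metrics import f1_score

def best_thresholds(y_true, y_scores, grid=np.arange(0.05, 1.0, 0.05)):
    """Return the F1-maximising threshold and its F1 for each label."""
    thresholds, best_f1s = [], []
    for label in range(y_true.shape[1]):
        truth = y_true[:, label]
        f1s = [f1_score(truth, y_scores[:, label] >= t, zero_division=0) for t in grid]
        best = int(np.argmax(f1s))
        thresholds.append(float(grid[best]))
        best_f1s.append(f1s[best])
    return np.array(thresholds), np.array(best_f1s)

# Support-weighted aggregation, matching the "Weighted by the relative
# support of each label" figures quoted in the README:
# weighted_f1 = (per_label_f1 * support).sum() / support.sum()
```

Because each label is optimised independently, the chosen thresholds can vary widely, which matches the spread in the table (0.10 for realization and disappointment up to 0.85 for the six-example grief label).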