MoritzLaurer committed on
Commit
c0fead2
·
1 Parent(s): 403444f

Update README.md

Files changed (1)
  1. README.md +50 -0
README.md CHANGED
@@ -62,6 +62,56 @@ The model can only do text classification tasks.
 
  Please consult the original DeBERTa paper and the papers for the different datasets for potential biases.
 
+ ## Metrics
+
+ Balanced accuracy on all datasets. The `deberta-v3-large-zeroshot-v1.1-heldout` column reports zero-shot performance on the respective dataset.
+ To calculate these metrics, 28 different models were trained, each with one dataset held out from training to simulate a zero-shot setup.
+ `deberta-v3-large-zeroshot-v1.1-all-33` was trained on all datasets, with at most 500 texts per class to avoid overfitting.
+ (The metrics in the last column are therefore not strictly zero-shot.)
+
+
+ | | deberta-v3-large-mnli-fever-anli-ling-wanli-binary | deberta-v3-large-zeroshot-v1.1-heldout | deberta-v3-large-zeroshot-v1.1-all-33 |
+ |:---------------------------|----------------------------:|-----------------------------------------:|----------------------------------------:|
+ | datasets mean (w/o nli) | 64.1 | 73.4 | 85.2 |
+ | amazonpolarity (2) | 94.7 | 96.6 | 96.8 |
+ | imdb (2) | 90.3 | 95.2 | 95.5 |
+ | appreviews (2) | 93.6 | 94.3 | 94.7 |
+ | yelpreviews (2) | 98.5 | 98.4 | 98.9 |
+ | rottentomatoes (2) | 83.9 | 90.5 | 90.8 |
+ | emotiondair (6) | 49.2 | 42.1 | 72.1 |
+ | emocontext (4) | 57 | 69.3 | 82.4 |
+ | empathetic (32) | 42 | 34.4 | 58 |
+ | financialphrasebank (3) | 77.4 | 77.5 | 91.9 |
+ | banking77 (72) | 29.1 | 52.8 | 72.2 |
+ | massive (59) | 47.3 | 64.7 | 77.3 |
+ | wikitoxic_toxicaggreg (2) | 81.6 | 86.6 | 91 |
+ | wikitoxic_obscene (2) | 85.9 | 91.9 | 93.1 |
+ | wikitoxic_threat (2) | 77.9 | 93.7 | 97.6 |
+ | wikitoxic_insult (2) | 77.8 | 91.1 | 92.3 |
+ | wikitoxic_identityhate (2) | 86.4 | 89.8 | 95.7 |
+ | hateoffensive (3) | 62.8 | 66.5 | 88.4 |
+ | hatexplain (3) | 46.9 | 61 | 76.9 |
+ | biasframes_offensive (2) | 62.5 | 86.6 | 89 |
+ | biasframes_sex (2) | 87.6 | 89.6 | 92.6 |
+ | biasframes_intent (2) | 54.8 | 88.6 | 89.9 |
+ | agnews (4) | 81.9 | 82.8 | 90.9 |
+ | yahootopics (10) | 37.7 | 65.6 | 74.3 |
+ | trueteacher (2) | 51.2 | 54.9 | 86.6 |
+ | spam (2) | 52.6 | 51.8 | 97.1 |
+ | wellformedquery (2) | 49.9 | 40.4 | 82.7 |
+ | manifesto (56) | 10.6 | 29.4 | 44.1 |
+ | capsotu (21) | 23.2 | 69.4 | 74 |
+ | mnli_m (2) | 93.1 | nan | 93.1 |
+ | mnli_mm (2) | 93.2 | nan | 93.2 |
+ | fevernli (2) | 89.3 | nan | 89.5 |
+ | anli_r1 (2) | 87.9 | nan | 87.3 |
+ | anli_r2 (2) | 76.3 | nan | 78 |
+ | anli_r3 (2) | 73.6 | nan | 74.1 |
+ | wanli (2) | 82.8 | nan | 82.7 |
+ | lingnli (2) | 90.2 | nan | 89.6 |
+
+
+
  ## License
  The base model (DeBERTa-v3) is published under the MIT license.
  The datasets the model was fine-tuned on are published under a diverse set of licenses.
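
Note on the metric: the Metrics section added in this commit reports balanced accuracy but does not show how it is computed. A minimal sketch follows, assuming the common definition (the unweighted mean of per-class recall). The commit does not say which implementation was used, so the `balanced_accuracy` helper and the toy labels below are hypothetical and for illustration only.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    correct = defaultdict(int)  # correctly predicted examples per true class
    total = defaultdict(int)    # number of examples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy data (not from the commit): 8 "ham" texts and 2 "spam" texts.
y_true = ["ham"] * 8 + ["spam"] * 2
y_pred = ["ham"] * 10  # a classifier that always predicts the majority class

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

Under this definition, a binary classifier that always predicts one class scores 50 regardless of class imbalance, which is a useful reference point when reading the two-class rows of the table.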