51la5 committed on
Commit
dc2429a
1 Parent(s): db62046

Upload 8 files

Files changed (8)
  1. README.md +179 -0
  2. config.json +31 -0
  3. map.jpeg +0 -0
  4. pytorch_model.bin +3 -0
  5. rust_model.ot +3 -0
  6. tf_model.h5 +3 -0
  7. tokenizer_config.json +1 -0
  8. vocab.txt +0 -0
README.md ADDED
@@ -0,0 +1,179 @@
+ ---
+ language: en
+ license: apache-2.0
+ datasets:
+ - sst2
+ - glue
+ model-index:
+ - name: distilbert-base-uncased-finetuned-sst-2-english
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: glue
+       type: glue
+       config: sst2
+       split: validation
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.9105504587155964
+       verified: true
+     - name: Precision
+       type: precision
+       value: 0.8978260869565218
+       verified: true
+     - name: Recall
+       type: recall
+       value: 0.9301801801801802
+       verified: true
+     - name: AUC
+       type: auc
+       value: 0.9716626673402374
+       verified: true
+     - name: F1
+       type: f1
+       value: 0.9137168141592922
+       verified: true
+     - name: loss
+       type: loss
+       value: 0.39013850688934326
+       verified: true
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: sst2
+       type: sst2
+       config: default
+       split: train
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.9885521685548412
+       verified: true
+     - name: Precision Macro
+       type: precision
+       value: 0.9881965062029833
+       verified: true
+     - name: Precision Micro
+       type: precision
+       value: 0.9885521685548412
+       verified: true
+     - name: Precision Weighted
+       type: precision
+       value: 0.9885639626373408
+       verified: true
+     - name: Recall Macro
+       type: recall
+       value: 0.9886145346602994
+       verified: true
+     - name: Recall Micro
+       type: recall
+       value: 0.9885521685548412
+       verified: true
+     - name: Recall Weighted
+       type: recall
+       value: 0.9885521685548412
+       verified: true
+     - name: F1 Macro
+       type: f1
+       value: 0.9884019815052447
+       verified: true
+     - name: F1 Micro
+       type: f1
+       value: 0.9885521685548412
+       verified: true
+     - name: F1 Weighted
+       type: f1
+       value: 0.9885546181087554
+       verified: true
+     - name: loss
+       type: loss
+       value: 0.040652573108673096
+       verified: true
+ ---
+
+ # DistilBERT base uncased finetuned SST-2
+
+ ## Table of Contents
+ - [Model Details](#model-details)
+ - [How to Get Started With the Model](#how-to-get-started-with-the-model)
+ - [Uses](#uses)
+ - [Risks, Limitations and Biases](#risks-limitations-and-biases)
+ - [Training](#training)
+
+ ## Model Details
+ **Model Description:** This model is a fine-tuned checkpoint of [DistilBERT-base-uncased](https://huggingface.co/distilbert-base-uncased), fine-tuned on SST-2.
+ It reaches an accuracy of 91.3 on the dev set (for comparison, the BERT bert-base-uncased version reaches an accuracy of 92.7).
+ - **Developed by:** Hugging Face
+ - **Model Type:** Text Classification
+ - **Language(s):** English
+ - **License:** Apache-2.0
+ - **Parent Model:** For more details about DistilBERT, we encourage users to check out [this model card](https://huggingface.co/distilbert-base-uncased).
+ - **Resources for more information:**
+   - [Model Documentation](https://huggingface.co/docs/transformers/main/en/model_doc/distilbert#transformers.DistilBertForSequenceClassification)
+
+ ## How to Get Started With the Model
+
+ Example of single-label classification:
+
+ ```python
+ import torch
+ from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
+
+ # Load the fine-tuned SST-2 checkpoint (not the base model, whose
+ # classification head is untrained).
+ tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
+ model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
+
+ inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
+ with torch.no_grad():
+     logits = model(**inputs).logits
+
+ predicted_class_id = logits.argmax().item()
+ print(model.config.id2label[predicted_class_id])
+ ```
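
The logits in the example above can be turned into class probabilities with a softmax. A minimal sketch, using a hypothetical logits tensor in place of the model output so no download is needed (the `id2label` mapping matches this model's config):

```python
import torch

# Hypothetical logits for one input, ordered [NEGATIVE, POSITIVE].
logits = torch.tensor([[-2.0, 3.5]])

# Normalize to probabilities and pick the most likely class.
probs = torch.softmax(logits, dim=-1)
predicted_class_id = probs.argmax(dim=-1).item()

id2label = {0: "NEGATIVE", 1: "POSITIVE"}
print(id2label[predicted_class_id], round(probs[0, predicted_class_id].item(), 4))
```

The softmax step is what the `score` field of the `pipeline` API reports; `argmax` alone is enough when only the predicted label is needed.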
+
+ ## Uses
+
+ #### Direct Use
+
+ This model is fine-tuned for single-label text classification: given an English sentence, it predicts whether the sentiment is positive or negative. See the model hub to look for other fine-tuned versions of DistilBERT on a task that interests you.
+
+ #### Misuse and Out-of-scope Use
+ The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for its abilities.
+
+
+ ## Risks, Limitations and Biases
+
+ Based on a few experiments, we observed that this model can produce biased predictions that target underrepresented populations.
+
+ For instance, for sentences like `This film was filmed in COUNTRY`, this binary classification model will give radically different probabilities for the positive label depending on the country (0.89 if the country is France, but 0.08 if the country is Afghanistan) when nothing in the input indicates such a strong semantic shift. In this [colab](https://colab.research.google.com/gist/ageron/fb2f64fb145b4bc7c49efc97e5f114d3/biasmap.ipynb), [Aurélien Géron](https://twitter.com/aureliengeron) made an interesting map plotting these probabilities for each country.
+
+ <img src="https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/map.jpeg" alt="Map of positive probabilities per country." width="500"/>
+
+ We strongly advise users to thoroughly probe these aspects on their use-cases in order to evaluate the risks of this model. We recommend looking at the following bias evaluation datasets as a place to start: [WinoBias](https://huggingface.co/datasets/wino_bias), [WinoGender](https://huggingface.co/datasets/super_glue), [Stereoset](https://huggingface.co/datasets/stereoset).
+
+
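
A lightweight way to run this kind of probe yourself is to score a template sentence across a list of fill-ins and compare the positive-label probabilities. A sketch, assuming the `pipeline` API with this checkpoint (running it downloads the model; the template and country list are illustrative):

```python
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

template = "This film was filmed in {}."
positive_prob = {}
for country in ["France", "Afghanistan", "Brazil", "Japan"]:
    out = classifier(template.format(country))[0]
    # Normalize to a positive-label probability regardless of predicted label.
    positive_prob[country] = out["score"] if out["label"] == "POSITIVE" else 1.0 - out["score"]

for country, p in positive_prob.items():
    print(f"{country:12s} P(positive) = {p:.2f}")
```

Large spreads between fill-ins that should be sentiment-neutral are the signal to look for before deploying the model on your own data.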
+ ## Training
+
+ #### Training Data
+
+ The authors fine-tune the model on the Stanford Sentiment Treebank ([sst2](https://huggingface.co/datasets/sst2)) corpus.
+
+ #### Training Procedure
+
+ ###### Fine-tuning hyper-parameters
+
+ - learning_rate = 1e-5
+ - batch_size = 32
+ - warmup = 600
+ - max_seq_length = 128
+ - num_train_epochs = 3.0
+
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "activation": "gelu",
+   "architectures": [
+     "DistilBertForSequenceClassification"
+   ],
+   "attention_dropout": 0.1,
+   "dim": 768,
+   "dropout": 0.1,
+   "finetuning_task": "sst-2",
+   "hidden_dim": 3072,
+   "id2label": {
+     "0": "NEGATIVE",
+     "1": "POSITIVE"
+   },
+   "initializer_range": 0.02,
+   "label2id": {
+     "NEGATIVE": 0,
+     "POSITIVE": 1
+   },
+   "max_position_embeddings": 512,
+   "model_type": "distilbert",
+   "n_heads": 12,
+   "n_layers": 6,
+   "output_past": true,
+   "pad_token_id": 0,
+   "qa_dropout": 0.1,
+   "seq_classif_dropout": 0.2,
+   "sinusoidal_pos_embds": false,
+   "tie_weights_": true,
+   "vocab_size": 30522
+ }
map.jpeg ADDED
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:60554cbd7781b09d87f1ececbea8c064b94e49a7f03fd88e8775bfe6cc3d9f88
+ size 267844284
rust_model.ot ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9db97da21b97a5e6db1212ce6a810a0c5e22c99daefe3355bae2117f78a0abb9
+ size 267846324
tf_model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b44df675bb34ccd8e57c14292c811ac7358b7c8e37c7f212745f640cd6019ac8
+ size 267949840
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"model_max_length": 512, "do_lower_case": true}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff