cdhinrichs committed
Commit 127c813
1 Parent(s): 07bbebf

Added a model card

Files changed (1):
  1. README.md +97 -0
README.md CHANGED
@@ -1,3 +1,100 @@
  ---
+ language:
+ - "en"
  license: mit
+ datasets:
+ - glue
+ metrics:
+ - Classification accuracy
  ---
+
+
+ # Model Card for cdhinrichs/albert-large-v2-sst2
+ This model was fine-tuned on the GLUE/sst2 task, starting from the pretrained
+ albert-large-v2 model. Hyperparameters were largely taken from the following
+ publication, with minor exceptions noted below.
+
+ ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
+ https://arxiv.org/abs/1909.11942
+
+ ## Model Details
+
+ ### Model Description
+ - **Developed by:** https://huggingface.co/cdhinrichs
+ - **Model type:** Text Sequence Classification
+ - **Language(s) (NLP):** English
+ - **License:** MIT
+ - **Finetuned from model:** https://huggingface.co/albert-large-v2
+
+ ## Uses
+ Text classification, research and development.
+
+ ### Out-of-Scope Use
+ Not intended for production use.
+ See https://huggingface.co/albert-large-v2
+
+ ## Bias, Risks, and Limitations
+ See https://huggingface.co/albert-large-v2
+
+ ### Recommendations
+ See https://huggingface.co/albert-large-v2
+
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ ```python
+ from transformers import AlbertForSequenceClassification
+ model = AlbertForSequenceClassification.from_pretrained("cdhinrichs/albert-large-v2-sst2")
+ ```
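+
+ Below is a minimal inference sketch (an illustrative addition, not part of the
+ original commit). It assumes the repository ships tokenizer files; if not,
+ load the tokenizer from albert-large-v2 instead:
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AlbertForSequenceClassification
+
+ tokenizer = AutoTokenizer.from_pretrained("cdhinrichs/albert-large-v2-sst2")
+ model = AlbertForSequenceClassification.from_pretrained("cdhinrichs/albert-large-v2-sst2")
+ model.eval()
+
+ inputs = tokenizer("A gorgeous, witty, seductive movie.", return_tensors="pt")
+ with torch.no_grad():
+     logits = model(**inputs).logits
+
+ # In GLUE/sst2, label 0 is negative and label 1 is positive sentiment.
+ print("positive" if logits.argmax(dim=-1).item() == 1 else "negative")
+ ```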
+
+ ## Training Details
+
+ ### Training Data
+ See https://huggingface.co/datasets/glue#sst2
+
+ SST2 is a binary sentiment classification task and a part of the GLUE
+ benchmark.
+
+
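+ A sketch of loading the data with the datasets library (an illustrative
+ addition; the library is not referenced in the original commit):
+
+ ```python
+ from datasets import load_dataset
+
+ # Each example has a "sentence" and a binary "label" (0 = negative, 1 = positive).
+ sst2 = load_dataset("glue", "sst2")
+ print(sst2["train"][0])
+ ```
+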
+ ### Training Procedure
+ The pretrained ALBERT model at https://huggingface.co/albert-large-v2 was
+ fine-tuned with Adam optimization.
+
+ ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
+ https://arxiv.org/abs/1909.11942
+
+
+ #### Training Hyperparameters
+ Training hyperparameters (learning rate, batch size, ALBERT dropout rate,
+ classifier dropout rate, warmup steps, and training steps) were taken from
+ Table A.4 of:
+
+ ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
+ https://arxiv.org/abs/1909.11942
+
+ Max sequence length (MSL) was set to 128, differing from the above.
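+
+ A sketch of the fine-tuning setup described above (an illustrative addition;
+ the hyperparameter values below are placeholders, not the Table A.4 settings,
+ and the Trainer's default AdamW optimizer stands in for Adam):
+
+ ```python
+ from datasets import load_dataset
+ from transformers import (AlbertForSequenceClassification, AutoTokenizer,
+                           Trainer, TrainingArguments)
+
+ tokenizer = AutoTokenizer.from_pretrained("albert-large-v2")
+ model = AlbertForSequenceClassification.from_pretrained("albert-large-v2",
+                                                         num_labels=2)
+
+ def tokenize(batch):
+     # MSL of 128, as noted above.
+     return tokenizer(batch["sentence"], truncation=True, max_length=128)
+
+ sst2 = load_dataset("glue", "sst2").map(tokenize, batched=True)
+
+ # Placeholder values only; see Table A.4 of the ALBERT paper for the settings
+ # actually used.
+ args = TrainingArguments(
+     output_dir="albert-large-v2-sst2",
+     learning_rate=1e-5,
+     per_device_train_batch_size=32,
+     warmup_steps=1000,
+     max_steps=20000,
+ )
+
+ Trainer(model=model, args=args, train_dataset=sst2["train"],
+         tokenizer=tokenizer).train()
+ ```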
+
+
+ ## Evaluation
+ Classification accuracy is used to evaluate model performance.
+
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+ See https://huggingface.co/datasets/glue#sst2
+
+ #### Metrics
+ Classification accuracy
+
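+ A sketch of computing validation accuracy (an illustrative addition; GLUE test
+ labels are hidden, so the validation split and a max length of 128 are assumed
+ here):
+
+ ```python
+ import torch
+ from datasets import load_dataset
+ from transformers import AutoTokenizer, AlbertForSequenceClassification
+
+ tokenizer = AutoTokenizer.from_pretrained("cdhinrichs/albert-large-v2-sst2")
+ model = AlbertForSequenceClassification.from_pretrained(
+     "cdhinrichs/albert-large-v2-sst2").eval()
+
+ val = load_dataset("glue", "sst2", split="validation")
+ correct = 0
+ for ex in val:
+     inputs = tokenizer(ex["sentence"], truncation=True, max_length=128,
+                        return_tensors="pt")
+     with torch.no_grad():
+         pred = model(**inputs).logits.argmax(dim=-1).item()
+     correct += int(pred == ex["label"])
+ print(correct / len(val))
+ ```
+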
+ ### Results
+ Training classification accuracy: 0.9990794221146565
+
+ Evaluation classification accuracy: 0.9461009174311926
+
+
+ ## Environmental Impact
+ The model was fine-tuned on a single user workstation with a single GPU. CO2
+ impact is expected to be minimal.
+