model documentation

#2
by nazneen - opened
Files changed (1)
  1. README.md +189 -0
README.md ADDED
@@ -0,0 +1,189 @@
+ ---
+ tags:
+ - feature-extraction
+ ---
+ # Model Card for fixed-distilroberta-base
+
+
+ # Model Details
+
+ ## Model Description
+
+ - **Developed by:** Hamish Ivison
+ - **Shared by [Optional]:** More information needed
+ - **Model type:** Feature Extraction
+ - **Language(s) (NLP):** More information needed
+ - **License:** More information needed
+ - **Related Models:** distilroberta-base
+ - **Parent Model:** RoBERTa (see the config check below)
+ - **Resources for more information:** More information needed
+
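+ As an optional sanity check (not part of the original card), the hosted configuration can be inspected to confirm the architecture details listed above; the repository id is the one used in the "How to Get Started" section:
+
+ ```python
+ from transformers import AutoConfig
+
+ # Fetch only the config file and print the architecture metadata
+ config = AutoConfig.from_pretrained("hamishivi/fixed-distilroberta-base")
+ print(config.model_type)                             # expected: "roberta"
+ print(config.num_hidden_layers, config.hidden_size)  # expected: 6 768, matching distilroberta-base
+ ```
+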
+ # Uses
+
+
+ ## Direct Use
+
+ This model can be used for the task of feature extraction.
+
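+ As a minimal, hedged sketch of direct use (not taken from the original card), the `transformers` feature-extraction pipeline can serve token embeddings from this checkpoint; the repository id is the one used in the "How to Get Started" section:
+
+ ```python
+ from transformers import pipeline
+
+ # For a single string, the pipeline returns nested lists shaped [1, num_tokens, hidden_size]
+ extractor = pipeline("feature-extraction", model="hamishivi/fixed-distilroberta-base")
+ features = extractor("Hello, world!")
+ print(len(features[0]), len(features[0][0]))  # number of tokens, hidden size (768)
+ ```
+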
+ ## Downstream Use [Optional]
+
+ More information needed
+
+ ## Out-of-Scope Use
+
+ The model should not be used to intentionally create hostile or alienating environments for people.
+
+ # Bias, Risks, and Limitations
+
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
+
+ The training data used for this model contains a large amount of unfiltered content from the internet, which is far from neutral. Therefore, the model can produce biased predictions.
+
+
+ ## Recommendations
+
+ Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information needed for further recommendations.
+
+
+ # Training Details
+
+ ## Training Data
+
+ The RoBERTa model was pretrained on the union of five datasets:
+ - [BookCorpus](https://yknzhu.wixsite.com/mbweb), a dataset consisting of 11,038 unpublished books;
+ - [English Wikipedia](https://en.wikipedia.org/wiki/English_Wikipedia) (excluding lists, tables and headers);
+ - [CC-News](https://commoncrawl.org/2016/10/news-dataset-available/), a dataset containing 63 million English news
+ articles crawled between September 2016 and February 2019;
+ - [OpenWebText](https://github.com/jcpeterson/openwebtext), an open-source recreation of the WebText dataset used to
+ train GPT-2;
+ - [Stories](https://arxiv.org/abs/1806.02847), a dataset containing a subset of CommonCrawl data filtered to match the
+ story-like style of Winograd schemas.
+
+
+
+ ## Training Procedure
+
+
+ ### Preprocessing
+
+ More information needed
+
+ ### Speeds, Sizes, Times
+
+ More information needed
+
+ # Evaluation
+
+
+ ## Testing Data, Factors & Metrics
+
+ ### Testing Data
+
+ More information needed
+
+ ### Factors
+
+ More information needed
+
+ ### Metrics
+
+ More information needed
+
+ ## Results
+
+ More information needed
+
+ # Model Examination
+
+ More information needed
+
+ # Environmental Impact
+
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** More information needed
+ - **Hours used:** More information needed
+ - **Cloud Provider:** More information needed
+ - **Compute Region:** More information needed
+ - **Carbon Emitted:** More information needed
+
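+ None of the fields above are reported yet. As a purely illustrative sketch (not something the model authors describe), run-time emissions for fine-tuning or inference with this model could also be measured with the `codecarbon` package and then cross-checked against the calculator linked above:
+
+ ```python
+ from codecarbon import EmissionsTracker
+
+ # Wrap whatever workload you run with the model; the tracker estimates energy use
+ # from local hardware and converts it to kg of CO2-equivalent.
+ tracker = EmissionsTracker(project_name="fixed-distilroberta-base")
+ tracker.start()
+ # ... fine-tuning or batch inference goes here ...
+ emissions_kg = tracker.stop()
+ print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
+ ```
+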
+ # Technical Specifications [optional]
+
+ ## Model Architecture and Objective
+
+ More information needed
+
+ ## Compute Infrastructure
+
+ More information needed
+
+ ### Hardware
+
+ More information needed
+
+ ### Software
+
+ More information needed
+
+ # Citation
+
+ **BibTeX:**
+ ```
+ @article{DBLP:journals/corr/abs-1907-11692,
+   author    = {Yinhan Liu and
+                Myle Ott and
+                Naman Goyal and
+                Jingfei Du and
+                Mandar Joshi and
+                Danqi Chen and
+                Omer Levy and
+                Mike Lewis and
+                Luke Zettlemoyer and
+                Veselin Stoyanov},
+   title     = {RoBERTa: {A} Robustly Optimized {BERT} Pretraining Approach},
+   journal   = {CoRR},
+   volume    = {abs/1907.11692},
+   year      = {2019},
+   url       = {http://arxiv.org/abs/1907.11692},
+   archivePrefix = {arXiv},
+   eprint    = {1907.11692},
+   timestamp = {Thu, 01 Aug 2019 08:59:33 +0200},
+   biburl    = {https://dblp.org/rec/journals/corr/abs-1907-11692.bib},
+   bibsource = {dblp computer science bibliography, https://dblp.org}
+ }
+ ```
+
+
+ # Glossary [optional]
+
+ More information needed
+
+ # More Information [optional]
+
+ More information needed
+
+ # Model Card Authors [optional]
+
+ Hamish Ivison in collaboration with Ezi Ozoani and the Hugging Face team
+
+ # Model Card Contact
+
+ More information needed
+
+ # How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+
+ # Download the tokenizer and the encoder weights from the Hugging Face Hub
+ tokenizer = AutoTokenizer.from_pretrained("hamishivi/fixed-distilroberta-base")
+ model = AutoModel.from_pretrained("hamishivi/fixed-distilroberta-base")
+ ```
+ </details>
+
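+ Building on the loading snippet above, a minimal usage sketch (an illustration, assuming the checkpoint behaves as a standard RoBERTa encoder) for extracting token-level features:
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModel
+
+ tokenizer = AutoTokenizer.from_pretrained("hamishivi/fixed-distilroberta-base")
+ model = AutoModel.from_pretrained("hamishivi/fixed-distilroberta-base")
+
+ # Tokenize a sentence and take the final hidden states as features
+ inputs = tokenizer("Hello, world!", return_tensors="pt")
+ with torch.no_grad():
+     outputs = model(**inputs)
+ features = outputs.last_hidden_state  # shape: [1, num_tokens, 768]
+ print(features.shape)
+ ```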