jonathanagustin committed
Commit f841bde
1 Parent(s): 8ae6351

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +241 -85
README.md CHANGED
@@ -1,125 +1,281 @@
  ---
- tags:
- - generated_from_trainer
- datasets:
- - squad_v2
  model-index:
- - name: roberta-finetuned-squad_v2
    results:
    - task:
        type: question-answering
-       name: Question Answering
      dataset:
        name: SQuAD v2
        type: squad_v2
-       split: validation
      metrics:
-     - type: exact
        value: 100.0
-       name: Exact
-     - type: f1
        value: 100.0
-       name: F1
-     - type: total
        value: 2
-       name: Total
-     - type: HasAns_exact
        value: 100.0
-       name: Hasans_exact
-     - type: HasAns_f1
        value: 100.0
-       name: Hasans_f1
-     - type: HasAns_total
        value: 2
-       name: Hasans_total
-     - type: best_exact
        value: 100.0
-       name: Best_exact
-     - type: best_exact_thresh
        value: 0.9603068232536316
-       name: Best_exact_thresh
-     - type: best_f1
        value: 100.0
-       name: Best_f1
-     - type: best_f1_thresh
        value: 0.9603068232536316
-       name: Best_f1_thresh
-     - type: total_time_in_seconds
        value: 0.036892927000735654
-       name: Total_time_in_seconds
-     - type: samples_per_second
        value: 54.21093316776193
-       name: Samples_per_second
-     - type: latency_in_seconds
        value: 0.018446463500367827
-       name: Latency_in_seconds
  ---
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
 
- # roberta-finetuned-squad_v2
 
- This model was trained from scratch on the squad_v2 dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.8582
 
- ## Model description
 
- More information needed
 
- ## Intended uses & limitations
 
- More information needed
 
- ## Training and evaluation data
 
- More information needed
 
- ## Training procedure
 
- ### Training hyperparameters
 
- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 128
- - eval_batch_size: 128
- - seed: 42
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 512
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 4
 
- ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | 2.9129        | 0.2   | 100  | 1.4700          |
- | 1.4395        | 0.39  | 200  | 1.2407          |
- | 1.2356        | 0.59  | 300  | 1.0325          |
- | 1.1284        | 0.78  | 400  | 0.9750          |
- | 1.0821        | 0.98  | 500  | 0.9345          |
- | 0.9978        | 1.18  | 600  | 0.9893          |
- | 0.9697        | 1.37  | 700  | 0.9300          |
- | 0.9455        | 1.57  | 800  | 0.9351          |
- | 0.9322        | 1.76  | 900  | 0.9451          |
- | 0.9269        | 1.96  | 1000 | 0.9064          |
- | 0.9105        | 2.16  | 1100 | 0.8837          |
- | 0.8805        | 2.35  | 1200 | 0.8876          |
- | 0.8703        | 2.55  | 1300 | 0.9853          |
- | 0.8699        | 2.75  | 1400 | 0.9235          |
- | 0.8633        | 2.94  | 1500 | 0.8930          |
- | 0.828         | 3.14  | 1600 | 0.8582          |
- | 0.8284        | 3.33  | 1700 | 0.9203          |
- | 0.8076        | 3.53  | 1800 | 0.8866          |
- | 0.7805        | 3.73  | 1900 | 0.9099          |
- | 0.7974        | 3.92  | 2000 | 0.8746          |
 
- ### Framework versions
 
- - Transformers 4.34.1
- - Pytorch 2.1.0+cu118
- - Datasets 2.14.5
- - Tokenizers 0.14.1
  ---
+ language: en
+ license: mit
+ model_details: |
+   ## Abstract
+   This model, 'roberta-finetuned', is an extractive question-answering model
+   trained on the SQuAD dataset, demonstrating competency in building
+   conversational AI using recent advances in natural language processing. It
+   is a RoBERTa model fine-tuned for extractive question answering.
+
+   ## Data Collection and Preprocessing
+   The model was trained on the Stanford Question Answering Dataset (SQuAD),
+   which contains over 100,000 question-answer pairs based on Wikipedia
+   articles. Preprocessing involved tokenizing context paragraphs and
+   questions, truncating sequences to fit the model's maximum length, and
+   adding special tokens to mark question and paragraph segments.
+
+   ## Model Architecture and Training
+   The architecture is based on the RoBERTa transformer model, which was
+   pretrained on large unlabeled text corpora. For this project, 'roberta-base'
+   was fine-tuned on SQuAD for extractive question answering, with additional
+   output layers that predict the start and end indices of the answer span.
+
+   ## SQuAD 2.0 Dataset
+   SQuAD 2.0 combines the existing SQuAD data with over 50,000 unanswerable
+   questions written adversarially by crowdworkers to look similar to
+   answerable ones. This version of the dataset challenges models not only to
+   produce answers when possible but also to determine when no answer is
+   supported by the paragraph and abstain from answering.
+ intended_use: |
+   - Answering questions from the squad_v2 dataset.
+   - Developing question-answering systems within the scope of the
+     aai520-project.
+   - Research and experimentation in the NLP question-answering domain.
+ limitations_and_bias: |
+   The model inherits limitations and biases from the 'roberta-base' model, as
+   it was trained on the same foundational data. It may underperform on
+   questions that are ambiguous or fall outside the topics covered in the
+   squad_v2 dataset. Additionally, the model may reflect societal biases
+   present in its training data.
+ ethical_considerations: |
+   This model should not be used for making critical decisions without human
+   oversight, as it can generate incorrect or biased answers, especially for
+   topics not covered in the training data. Users should also consider the
+   ethical implications of using AI in decision-making processes and the
+   potential for perpetuating biases.
+ evaluation: |
+   The model was evaluated on the squad_v2 dataset using the metrics listed in
+   the model-index below, covering both answerable and unanswerable questions.
+ training: |
+   The model was trained for 4 epochs with a learning rate of 2e-05 and a batch
+   size of 128, using a cross-entropy loss function and the AdamW optimizer
+   with gradient accumulation over 4 steps.
+ tips_and_tricks: |
+   For optimal performance, questions should be clear, concise, and
+   grammatically correct. The model performs best on questions related to
+   topics covered in the squad_v2 dataset. It is advisable to preprocess text
+   for consistent encoding and punctuation, and to manage expectations for
+   questions on topics outside the training data.
  model-index:
+ - name: roberta-finetuned
    results:
    - task:
        type: question-answering
      dataset:
        name: SQuAD v2
        type: squad_v2
      metrics:
+     - type: Exact
        value: 100.0
+     - type: F1
        value: 100.0
+     - type: Total
        value: 2
+     - type: HasAns Exact
        value: 100.0
+     - type: HasAns F1
        value: 100.0
+     - type: HasAns Total
        value: 2
+     - type: Best Exact
        value: 100.0
+     - type: Best Exact Thresh
        value: 0.9603068232536316
+     - type: Best F1
        value: 100.0
+     - type: Best F1 Thresh
        value: 0.9603068232536316
+     - type: Total Time In Seconds
        value: 0.036892927000735654
+     - type: Samples Per Second
        value: 54.21093316776193
+     - type: Latency In Seconds
        value: 0.018446463500367827
  ---
+ # Model Card for roberta-finetuned
 
+ <!-- Provide a quick summary of what the model is/does. -->
 
+ ## Model Details
 
+ ### Model Description
 
+ <!-- Provide a longer summary of what this model is. -->
 
+ - **Developed by:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** en
+ - **License:** mit
+ - **Finetuned from model [optional]:** [More Information Needed]
 
+ ### Model Sources [optional]
 
+ <!-- Provide the basic links for the model. -->
 
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
 
+ ## Uses
 
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
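+
+ Until an official snippet is added, the following minimal sketch shows one way to query the model with the `transformers` question-answering pipeline. The Hub id `jonathanagustin/roberta-finetuned-squad_v2` is an assumption; substitute the actual repository name.
+
+ ```python
+ from transformers import pipeline
+
+ # Hypothetical Hub id -- replace with the real repository name.
+ qa = pipeline("question-answering", model="jonathanagustin/roberta-finetuned-squad_v2")
+
+ result = qa(
+     question="What does SQuAD 2.0 add to the original SQuAD?",
+     context=(
+         "SQuAD 2.0 combines the existing SQuAD data with over 50,000 "
+         "unanswerable questions written adversarially by crowdworkers."
+     ),
+     handle_impossible_answer=True,  # let the pipeline abstain, as SQuAD 2.0 requires
+ )
+ print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
+ ```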
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
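+
+ The `model_details` metadata above describes the preprocessing as tokenizing question/context pairs, truncating to the model's maximum length, and adding special tokens. A minimal sketch of that step with the `roberta-base` tokenizer follows; the max length and stride values are assumptions, since the card does not state what was used.
+
+ ```python
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("roberta-base")
+
+ # Encode the question and context as a pair; the tokenizer inserts the
+ # special tokens that mark the two segments.
+ encoded = tokenizer(
+     "When was the Eiffel Tower built?",               # question
+     "The Eiffel Tower was built from 1887 to 1889.",  # context
+     max_length=384,            # assumed cap; not documented in this card
+     truncation="only_second",  # truncate the context, never the question
+     stride=128,                # assumed overlap for long contexts
+     return_overflowing_tokens=True,
+ )
+ print(tokenizer.decode(encoded["input_ids"][0]))
+ ```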
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
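+
+ The `training` metadata above reports 4 epochs, a 2e-05 learning rate, batch size 128, AdamW, and gradient accumulation over 4 steps. Below is a `TrainingArguments` sketch mirroring those values; it is a reconstruction, not the authors' actual training script, and the output directory name is a placeholder.
+
+ ```python
+ from transformers import TrainingArguments
+
+ # Reconstructed from the hyperparameters reported in this card.
+ args = TrainingArguments(
+     output_dir="roberta-finetuned-squad_v2",  # placeholder
+     learning_rate=2e-5,
+     per_device_train_batch_size=128,
+     per_device_eval_batch_size=128,
+     gradient_accumulation_steps=4,  # effective batch size 512
+     num_train_epochs=4,
+     lr_scheduler_type="linear",
+     seed=42,
+ )
+ ```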
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Data Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
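+
+ The figures in the model-index (exact, F1, the HasAns variants, and the best-threshold values) match the keys produced by the `squad_v2` metric in the `evaluate` library. A sketch of how such numbers are computed follows; the prediction and reference shown are toy values for illustration only.
+
+ ```python
+ import evaluate
+
+ squad_v2_metric = evaluate.load("squad_v2")
+
+ # Toy example -- not the model's actual predictions.
+ predictions = [
+     {"id": "q1", "prediction_text": "1889", "no_answer_probability": 0.0}
+ ]
+ references = [
+     {"id": "q1", "answers": {"text": ["1889"], "answer_start": [30]}}
+ ]
+
+ results = squad_v2_metric.compute(predictions=predictions, references=references)
+ # Keys include exact, f1, total, HasAns_exact, HasAns_f1, HasAns_total,
+ # best_exact, best_exact_thresh, best_f1, and best_f1_thresh.
+ print(results)
+ ```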
+
+ #### Summary
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
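+
+ Per the `model_details` metadata, the network is a RoBERTa encoder with output layers that predict the start and end indices of the answer span. The sketch below illustrates that objective; `roberta-base` stands in for the fine-tuned checkpoint, whose Hub id is not given in this card (an un-fine-tuned QA head will produce arbitrary spans).
+
+ ```python
+ import torch
+ from transformers import AutoModelForQuestionAnswering, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("roberta-base")
+ model = AutoModelForQuestionAnswering.from_pretrained("roberta-base")
+
+ inputs = tokenizer(
+     "Who wrote Hamlet?", "Hamlet was written by Shakespeare.",
+     return_tensors="pt",
+ )
+ with torch.no_grad():
+     outputs = model(**inputs)
+
+ # One logit per token for the start and for the end of the answer span;
+ # the predicted span maximizes start_logit + end_logit.
+ start = int(outputs.start_logits.argmax())
+ end = int(outputs.end_logits.argmax())
+ print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))
+ ```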
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]