thebogko committed on
Commit
8f1d478
1 Parent(s): a0fa44f

Update README.md

Files changed (1)
  1. README.md +53 -66
README.md CHANGED
@@ -91,100 +91,89 @@ print(correct_sentence)
  ### Training Data

  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]

  ### Training Procedure

  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-

  #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]

  ## Evaluation

  <!-- This section describes the evaluation protocols and provides the results. -->

  ### Testing Data, Factors & Metrics

  #### Testing Data

- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]

  #### Metrics

  <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]

  ### Results

- [More Information Needed]

  #### Summary

-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- [More Information Needed]
-
- ### Compute Infrastructure
-
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
-
  ## Citation [optional]

- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
  **BibTeX:**

  [More Information Needed]
@@ -195,8 +184,6 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]

  ## Glossary [optional]

- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
  [More Information Needed]

  ## More Information [optional]
@@ -209,4 +196,4 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]

  ## Model Card Contact

- [More Information Needed]
 
  ### Training Data

  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+ The training data comes from a collection of [Bulgarian grammar mistakes](https://huggingface.co/datasets/thebogko/bulgarian-grammar-mistakes), which contains 7.59k rows spanning four different types of grammar errors:
+ 1) **Misuse of articles**
+ 2) **Misuse of pronouns**
+ 3) Incorrect appending of 'me' to first-person plural verbs
+ 4) Disagreement between nouns and adjectives in grammatical gender and number
+
+ Only the first two error types were used to fine-tune this model, the rationale being that they are considerably more common overall (especially among native Bulgarian speakers), so restricting the data lets the model focus on them.
+
+ Filtering to these two types leaves 3090 pairs, which were split into training/validation/test sets at a 72/18/10 ratio, giving 2224 training pairs.
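
A minimal sketch of how this filtering and 72/18/10 split could be reproduced with the 🤗 `datasets` library; the `error_type` column name and its label strings are assumptions, not taken from the dataset card:

```python
from datasets import load_dataset

# Load the grammar-mistakes collection (assumed to expose an error-type column).
data = load_dataset("thebogko/bulgarian-grammar-mistakes", split="train")

# Keep only the two error types used for fine-tuning; the label strings are guesses.
kept = {"article_misuse", "pronoun_misuse"}
filtered = data.filter(lambda row: row["error_type"] in kept)   # ~3090 pairs

# 72/18/10: carve off 10% for test, then split the remaining 90% as 80/20,
# which gives 72% train and 18% validation of the filtered data (~2224 train pairs).
rest_test = filtered.train_test_split(test_size=0.10, seed=42)
train_val = rest_test["train"].train_test_split(test_size=0.20, seed=42)
splits = {
    "train": train_val["train"],
    "validation": train_val["test"],
    "test": rest_test["test"],          # ~309 pairs
}
```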

  ### Training Procedure

  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+ The standard fine-tuning procedure was applied: batches were created from the training samples and the model was evaluated on the validation split at the end of each epoch. The model weights are optimised using cross-entropy loss.
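
The card does not include the training script itself; as an illustrative sketch only, a batched fine-tune of mT5 with per-epoch evaluation could be set up roughly like this with `transformers` (the column names, sequence lengths and the `splits` mapping from the sketch above are assumptions):

```python
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base")

def preprocess(batch):
    # Assumed column names: erroneous sentence in, corrected sentence out.
    enc = tokenizer(batch["incorrect"], truncation=True, max_length=128)
    enc["labels"] = tokenizer(text_target=batch["correct"],
                              truncation=True, max_length=128)["input_ids"]
    return enc

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="mt5-bg-grammar",
        evaluation_strategy="epoch",      # validate at the end of every epoch
        per_device_train_batch_size=4,
    ),
    train_dataset=splits["train"].map(preprocess, batched=True),
    eval_dataset=splits["validation"].map(preprocess, batched=True),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()   # mT5 returns token-level cross-entropy loss on `labels`
```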
 
 
 
 

  #### Training Hyperparameters

+ A grid search was applied to find the best learning rate, number of epochs, weight decay and batch size. At the end of the experimentation stage the chosen setup was:
+ 1) **batch_size**: 4
+ 2) **learning_rate**: 0.0002
+ 3) **weight_decay**: 0.001
+ 4) **epoch number**: 4
+
+ This grid search was performed 3 separate times, and the setup above achieved the lowest average validation loss, 0.01431.
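
The search script is not part of the card; one plausible shape for it, reusing the trainer setup sketched above, is a plain grid over the four hyperparameters with repeated runs per configuration (one reading of "performed 3 separate times"). The candidate values and the `run_once` helper below are assumptions; only the winning configuration is reported:

```python
import itertools
import statistics

# Assumed candidate grid; the card only states which combination won.
grid = {
    "learning_rate": [1e-4, 2e-4, 3e-4],
    "batch_size": [4, 8, 16],
    "weight_decay": [0.0, 0.001, 0.01],
    "num_epochs": [2, 3, 4],
}

def run_once(learning_rate, batch_size, weight_decay, num_epochs):
    """Fine-tune once with this configuration (as in the trainer sketch above)
    and return the final validation loss."""
    raise NotImplementedError  # placeholder for the actual training run

best_loss, best_cfg = float("inf"), None
for cfg in itertools.product(*grid.values()):
    # Repeat each configuration three times and compare the average validation loss.
    avg_loss = statistics.mean(run_once(*cfg) for _ in range(3))
    if avg_loss < best_loss:
        best_loss, best_cfg = avg_loss, cfg

# Reported winner: batch_size=4, learning_rate=2e-4, weight_decay=1e-3, 4 epochs
# (average validation loss 0.01431).
```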
 
 
 
 

  ## Evaluation

  <!-- This section describes the evaluation protocols and provides the results. -->
+ Evaluation was performed against three other models:
+ - a bespoke RNN encoder-decoder model with attention
+ - the [gpt3.5 Turbo model](https://platform.openai.com/docs/models/gpt-3-5-turbo) by [OpenAI](https://openai.com)
+ - the [BgGPT model](https://huggingface.co/INSAIT-Institute/BgGPT-7B-Instruct-v0.1) by [INSAIT](https://insait.ai)

  ### Testing Data, Factors & Metrics

  #### Testing Data

+ The testing data consists of 309 pairs, the 10% test portion of the 72/18/10 train/validation/test split over the 3090 filtered pairs.
 
 
 
 
 
 
 
 

  #### Metrics

  <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+ The models are evaluated using precision, recall, f1 score, f0.5 score and BLEU.
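
The card does not spell out how matches are counted or how BLEU is applied (presumably the corrected output is compared against the reference correction), but the two F-scores follow the standard F_β definition, with β = 0.5 weighting precision more heavily than recall. A small helper makes the relationship explicit; note the reported figures are per-pair averages, so they will not be reproduced exactly from the averaged precision and recall:

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score; beta < 1 favours precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# For illustration, using the fine-tuned model's averaged precision/recall:
print(round(f_beta(0.6812, 0.6861, beta=1.0), 4))   # ~0.6836 (f1)
print(round(f_beta(0.6812, 0.6861, beta=0.5), 4))   # ~0.6822 (f0.5)
```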
 

  ### Results

+ The results below are averaged over the testing pairs.
+
+ | Model | precision | recall | f1 score | f0.5 score | BLEU |
+ |---|---|---|---|---|---|
+ | **mt5-base finetuned bulgarian-grammar-mistakes** | **0.6812** | **0.6861** | **0.6828** | **0.6818** | **0.9623** |
+ | gpt3.5 Turbo | 0.3751 | 0.6052 | 0.4331 | 0.3934 | 0.7666 |
+ | BgGPT | 0.3307 | 0.5987 | 0.3934 | 0.3503 | 0.7110 |
+ | RNN encoder-decoder with attention | 0.1717 | 0.2362 | 0.1820 | 0.1748 | 0.2087 |

  #### Summary

+ The evaluation shows that the fine-tuned model outperforms all the other models across the chosen metrics, particularly in precision. This implies that the model's strength lies in ensuring that the corrections it makes are in fact valid, in contrast to the other models, all of which exhibit recall much higher than their respective precision.

+ <!--
  ## Citation [optional]

  **BibTeX:**

  [More Information Needed]

  ## Glossary [optional]

  [More Information Needed]

  ## More Information [optional]

  ## Model Card Contact

+ [More Information Needed]-->