lgrobol committed
Commit eeaaffa
Parent: 0c6f6c0

update to new dataset version

README.md CHANGED
@@ -40,6 +40,11 @@ The training dataset consists of:
 
 These are obtained from the [OPUS](https://opus.nlpl.eu/) base (Tiedemann, 2012) and filtered using [OpusFilter](https://helsinki-nlp.github.io/OpusFilter) (Aulamo et al., 2020); see [`dl_opus.yaml`](dl_opus.yaml) for the details. The filtering is slightly non-deterministic due to the retraining of a statistical alignment model, but in my experience, different runs tend to give extremely similar results. Do not hesitate to reach out if you experience difficulties in using this to collect data.
 
+In addition to these, the training dataset also includes parallel br/fr sentences, provided as glosses in the [Arbres](https://arbres.iker.cnrs.fr) wiki (Jouitteau, 2022), obtained from their [ongoing port](https://github.com/Autogramm/Breton/commit/45ac2c444a979b7ee41e5f24a3bfd1ec39f09d7d) to Universal Dependencies in the Autogramm project.
+
 ## Training procedure
 
 The training hyperparameters are those suggested by Adelani et al. (2022) in their [code release](https://github.com/masakhane-io/lafand-mt), which gave their best results for machine translation of several African languages.
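
Editor's note on the data format: both the filtered OPUS corpus and the Arbres sentences (as produced by `extract_sents.py`, added in this commit) are stored as JSON-lines records with a `translation` object, which is the shape `run_translation.py` expects for its `--train_file`. A minimal sketch of inspecting such a file; `train.json` is a placeholder path, not a file shipped in this repo:

```python
# Minimal sketch, assuming the collected corpus is stored as JSON lines in the
# {"translation": {"br": ..., "fr": ...}} shape consumed by run_translation.py.
# "train.json" is a placeholder path.
from datasets import load_dataset

dataset = load_dataset("json", data_files={"train": "train.json"})
pair = dataset["train"][0]["translation"]
print(pair["br"], "->", pair["fr"])
```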
@@ -48,42 +53,57 @@ More specifically, we use the [example training script](https://github.com/huggi
 
 ```bash
 python run_translation.py \
-    --model_name_or_path facebook/m2m100_418M \
-    --do_train \
-    --train_file {path_to_train_corpus} \
-    --source_lang br \
-    --target_lang fr \
-    --output_dir {path_to_model} \
-    --per_device_train_batch_size=4 \
-    --per_device_eval_batch_size=4 \
-    --overwrite_output_dir \
-    --predict_with_generate \
-    --forced_bos_token fr \
-    --save_steps 50000 \
-    --num_beams 10 \
+    --model_name_or_path facebook/m2m100_418M \
+    --do_train \
+    --train_file {path_to_training_data} \
+    --source_lang br \
+    --target_lang fr \
+    --output_dir {path_to_model} \
+    --per_device_train_batch_size=8 \
+    --overwrite_output_dir \
+    --forced_bos_token fr \
+    --save_steps 4096 \
+    --fp16 \
+    --num_train_epochs 4
 ```
 
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
+
 - learning_rate: 5e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 3.0
+- num_epochs: 4.0
 
 ### Framework versions
 
-- Transformers 4.23.1
+- Transformers 4.24.0
 - Pytorch 1.12.1+cu116
 - Datasets 2.6.1
 - Tokenizers 0.13.1
 
 ## References
 
 - Adelani, David, Jesujoba Alabi, Angela Fan, Julia Kreutzer, Xiaoyu Shen, Machel Reid, Dana Ruiter, et al. 2022. "A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation". In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 3053-70. Seattle, United States: Association for Computational Linguistics. <https://doi.org/10.18653/v1/2022.naacl-main.223>.
 - Aulamo, Mikko, Sami Virpioja, and Jörg Tiedemann. 2020. "OpusFilter: A Configurable Parallel Corpus Filtering Toolbox". In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 150-156. Online: Association for Computational Linguistics.
 - Tiedemann, Jörg. 2012. "Parallel Data, Tools and Interfaces in OPUS". In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012).
+- Jouitteau, Mélanie (ed.). 2009-2022. ARBRES, wikigrammaire des dialectes du breton et centre de ressources pour son étude linguistique formelle. IKER, CNRS. <http://arbres.iker.cnrs.fr>. Creative Commons BY-NC-SA license.
 - Tyers, Francis M. 2009. "Rule-based augmentation of training data in Breton-French statistical machine translation". In Proceedings of the 13th Annual Conference of the European Association for Machine Translation (EAMT09), 213-218. Barcelona, Spain.
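
Editor's note: since the model is trained with `--forced_bos_token fr`, generation should force the French language token at inference time as well. A minimal sketch; the model path is a placeholder, and the Breton input sentence and `num_beams` value are only illustrative:

```python
# Minimal inference sketch for the fine-tuned M2M100 model. "{path_to_model}"
# is a placeholder; "Demat d'an holl !" is just an illustrative Breton input.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("{path_to_model}")
model = AutoModelForSeq2SeqLM.from_pretrained("{path_to_model}")

tokenizer.src_lang = "br"  # tell the tokenizer the source language is Breton
inputs = tokenizer("Demat d'an holl !", return_tensors="pt")
generated = model.generate(
    **inputs,
    # Force French as the first generated token, mirroring --forced_bos_token fr
    forced_bos_token_id=tokenizer.get_lang_id("fr"),
    num_beams=10,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```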
all_results.json CHANGED
@@ -1,8 +1,8 @@
 {
-  "epoch": 3.0,
-  "train_loss": 1.4955534830489579,
-  "train_runtime": 15709.331,
-  "train_samples": 48907,
-  "train_samples_per_second": 9.34,
-  "train_steps_per_second": 1.168
+  "epoch": 4.0,
+  "train_loss": 1.4005291703168083,
+  "train_runtime": 11994.4751,
+  "train_samples": 54393,
+  "train_samples_per_second": 18.139,
+  "train_steps_per_second": 1.134
 }
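
Editor's note: the new throughput figure is internally consistent with the other fields, since the reported samples per second matches train_samples × epoch / train_runtime. A small verification sketch:

```python
# Verifying the reported throughput from the other fields of all_results.json:
# 54393 samples over 4.0 epochs in 11994.4751 s is about 18.139 samples/s.
train_samples, epochs, runtime = 54393, 4.0, 11994.4751
print(round(train_samples * epochs / runtime, 3))  # -> 18.139
```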
config.json CHANGED
@@ -32,7 +32,7 @@
   "pad_token_id": 1,
   "scale_embedding": true,
   "torch_dtype": "float32",
-  "transformers_version": "4.23.1",
+  "transformers_version": "4.24.0",
   "use_cache": true,
   "vocab_size": 128112
 }
extract_sents.py ADDED
@@ -0,0 +1,45 @@
+ from typing import TextIO
+ import re
+ 
+ import click
+ import conllu
+ import jsonlines
+ 
+ 
+ @click.command(help="Extract a parallel corpus from a CoNLL-U file with translations")
+ @click.argument("conllu_path", type=click.File("r"))
+ @click.argument("output_path", type=click.File("w"), default="-")
+ @click.option("--main-langcode", default="br", show_default=True)
+ @click.option("--require-langcode", multiple=True, show_default=True)
+ def main(
+     conllu_path: TextIO,
+     main_langcode: str,
+     output_path: TextIO,
+     require_langcode: list[str],
+ ):
+     with jsonlines.Writer(output_path) as out_stream:
+         for tokenlist in conllu.parse_incr(conllu_path):
+             if m := re.match(r"'?(?P<content>[^/]+?)'?$", tokenlist.metadata["text"]):
+                 main_text = m.group("content")
+             else:
+                 continue
+             translations = {
+                 km.group("langcode"): kv.group("content")
+                 for k, v in tokenlist.metadata.items()
+                 if (km := re.match(r"text_(?P<langcode>.*)", k))
+                 and (kv := re.match(r"'?(?P<content>[^/]+?)'?$", v))
+             }
+             if not all(l in translations for l in require_langcode):
+                 continue
+             out_stream.write(
+                 {
+                     "translation": {
+                         main_langcode: main_text,
+                         **translations,
+                     }
+                 }
+             )
+ 
+ 
+ if __name__ == "__main__":
+     main()
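
Editor's note: the script keys each translation on the `text_{langcode}` sentence metadata of the CoNLL-U input, with the `text` field mapped to `--main-langcode` (`br` by default). For reference, a sketch of the record shape it writes; the sentence pair below is purely illustrative, not taken from the corpus:

```python
# Hypothetical output record of extract_sents.py for a CoNLL-U sentence whose
# metadata carries `text` (Breton) and a `text_fr` gloss.
record = {
    "translation": {
        "br": "Demat d'an holl !",  # from the `# text = ...` metadata line
        "fr": "Bonjour à tous !",   # from the `# text_fr = ...` metadata line
    }
}
```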
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7e6ca2e2048c0abf133d1ea4f986221f7744da591536e2c96c4aa7c8ad290d8e
-size 1935792071
+oid sha256:c1c8ae0171f992187869f7f6979a8762112705b6caa0404548a13cf039f8a5f1
+size 1935795713
train_results.json CHANGED
@@ -1,8 +1,8 @@
 {
-  "epoch": 3.0,
-  "train_loss": 1.4955534830489579,
-  "train_runtime": 15709.331,
-  "train_samples": 48907,
-  "train_samples_per_second": 9.34,
-  "train_steps_per_second": 1.168
+  "epoch": 4.0,
+  "train_loss": 1.4005291703168083,
+  "train_runtime": 11994.4751,
+  "train_samples": 54393,
+  "train_samples_per_second": 18.139,
+  "train_steps_per_second": 1.134
 }
trainer_state.json CHANGED
@@ -1,241 +1,187 @@
 {
   "best_metric": null,
   "best_model_checkpoint": null,
-  "epoch": 3.0,
-  "global_step": 18342,
+  "epoch": 4.0,
+  "global_step": 13600,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
   "log_history": [
     {
-      "epoch": 0.08,
-      "learning_rate": 4.863700795987351e-05,
-      "loss": 2.8121,
+      "epoch": 0.15,
+      "learning_rate": 4.816176470588236e-05,
+      "loss": 2.6313,
       "step": 500
     },
     {
-      "epoch": 0.16,
-      "learning_rate": 4.7274015919747035e-05,
-      "loss": 2.3962,
+      "epoch": 0.29,
+      "learning_rate": 4.632352941176471e-05,
+      "loss": 2.2069,
       "step": 1000
     },
     {
-      "epoch": 0.25,
-      "learning_rate": 4.5911023879620545e-05,
-      "loss": 2.2275,
+      "epoch": 0.44,
+      "learning_rate": 4.448529411764706e-05,
+      "loss": 2.035,
       "step": 1500
     },
     {
-      "epoch": 0.33,
-      "learning_rate": 4.454803183949406e-05,
-      "loss": 2.1283,
+      "epoch": 0.59,
+      "learning_rate": 4.2647058823529415e-05,
+      "loss": 1.9491,
       "step": 2000
     },
     {
-      "epoch": 0.41,
-      "learning_rate": 4.318503979936757e-05,
-      "loss": 2.0421,
+      "epoch": 0.74,
+      "learning_rate": 4.08125e-05,
+      "loss": 1.8742,
       "step": 2500
     },
     {
-      "epoch": 0.49,
-      "learning_rate": 4.182204775924109e-05,
-      "loss": 1.9646,
+      "epoch": 0.88,
+      "learning_rate": 3.897426470588236e-05,
+      "loss": 1.8387,
       "step": 3000
     },
     {
-      "epoch": 0.57,
-      "learning_rate": 4.0459055719114605e-05,
-      "loss": 1.9383,
+      "epoch": 1.03,
+      "learning_rate": 3.713602941176471e-05,
+      "loss": 1.6941,
       "step": 3500
     },
     {
-      "epoch": 0.65,
-      "learning_rate": 3.9096063678988114e-05,
-      "loss": 1.8862,
+      "epoch": 1.18,
+      "learning_rate": 3.529779411764706e-05,
+      "loss": 1.5224,
       "step": 4000
     },
     {
-      "epoch": 0.74,
-      "learning_rate": 3.773307163886163e-05,
-      "loss": 1.8524,
+      "epoch": 1.32,
+      "learning_rate": 3.3459558823529415e-05,
+      "loss": 1.4897,
       "step": 4500
     },
     {
-      "epoch": 0.82,
-      "learning_rate": 3.637007959873515e-05,
-      "loss": 1.8164,
+      "epoch": 1.47,
+      "learning_rate": 3.1621323529411765e-05,
+      "loss": 1.4445,
       "step": 5000
     },
     {
-      "epoch": 0.9,
-      "learning_rate": 3.500708755860866e-05,
-      "loss": 1.7536,
+      "epoch": 1.62,
+      "learning_rate": 2.978308823529412e-05,
+      "loss": 1.4593,
       "step": 5500
     },
     {
-      "epoch": 0.98,
-      "learning_rate": 3.3644095518482174e-05,
-      "loss": 1.7631,
+      "epoch": 1.76,
+      "learning_rate": 2.7944852941176468e-05,
+      "loss": 1.4251,
       "step": 6000
     },
     {
-      "epoch": 1.06,
-      "learning_rate": 3.228110347835569e-05,
-      "loss": 1.4821,
+      "epoch": 1.91,
+      "learning_rate": 2.6113970588235297e-05,
+      "loss": 1.39,
       "step": 6500
     },
     {
-      "epoch": 1.14,
-      "learning_rate": 3.09181114382292e-05,
-      "loss": 1.4501,
+      "epoch": 2.06,
+      "learning_rate": 2.427573529411765e-05,
+      "loss": 1.2959,
       "step": 7000
     },
     {
-      "epoch": 1.23,
-      "learning_rate": 2.9555119398102717e-05,
-      "loss": 1.4364,
+      "epoch": 2.21,
+      "learning_rate": 2.24375e-05,
+      "loss": 1.1621,
       "step": 7500
     },
     {
-      "epoch": 1.31,
-      "learning_rate": 2.8192127357976233e-05,
-      "loss": 1.4028,
+      "epoch": 2.35,
+      "learning_rate": 2.0599264705882353e-05,
+      "loss": 1.1374,
       "step": 8000
     },
     {
-      "epoch": 1.39,
-      "learning_rate": 2.6829135317849746e-05,
-      "loss": 1.3865,
+      "epoch": 2.5,
+      "learning_rate": 1.876102941176471e-05,
+      "loss": 1.1649,
       "step": 8500
     },
     {
-      "epoch": 1.47,
-      "learning_rate": 2.546614327772326e-05,
-      "loss": 1.4069,
+      "epoch": 2.65,
+      "learning_rate": 1.6926470588235294e-05,
+      "loss": 1.1513,
       "step": 9000
     },
     {
-      "epoch": 1.55,
-      "learning_rate": 2.4103151237596773e-05,
-      "loss": 1.3883,
+      "epoch": 2.79,
+      "learning_rate": 1.5088235294117647e-05,
+      "loss": 1.1463,
       "step": 9500
     },
     {
-      "epoch": 1.64,
-      "learning_rate": 2.2740159197470286e-05,
-      "loss": 1.3654,
+      "epoch": 2.94,
+      "learning_rate": 1.3250000000000002e-05,
+      "loss": 1.1466,
       "step": 10000
     },
     {
-      "epoch": 1.72,
-      "learning_rate": 2.1377167157343802e-05,
-      "loss": 1.3411,
+      "epoch": 3.09,
+      "learning_rate": 1.1411764705882353e-05,
+      "loss": 1.0411,
       "step": 10500
     },
     {
-      "epoch": 1.8,
-      "learning_rate": 2.0014175117217316e-05,
-      "loss": 1.3359,
+      "epoch": 3.24,
+      "learning_rate": 9.573529411764706e-06,
+      "loss": 0.9581,
       "step": 11000
     },
     {
-      "epoch": 1.88,
-      "learning_rate": 1.8651183077090832e-05,
-      "loss": 1.379,
+      "epoch": 3.38,
+      "learning_rate": 7.735294117647058e-06,
+      "loss": 0.9514,
       "step": 11500
     },
     {
-      "epoch": 1.96,
-      "learning_rate": 1.7288191036964345e-05,
-      "loss": 1.3316,
+      "epoch": 3.53,
+      "learning_rate": 5.897058823529412e-06,
+      "loss": 0.9429,
       "step": 12000
     },
     {
-      "epoch": 2.04,
-      "learning_rate": 1.592519899683786e-05,
-      "loss": 1.2242,
+      "epoch": 3.68,
+      "learning_rate": 4.058823529411765e-06,
+      "loss": 0.9676,
       "step": 12500
     },
     {
-      "epoch": 2.13,
-      "learning_rate": 1.4562206956711375e-05,
-      "loss": 1.0537,
+      "epoch": 3.82,
+      "learning_rate": 2.2205882352941175e-06,
+      "loss": 0.9324,
       "step": 13000
     },
     {
-      "epoch": 2.21,
-      "learning_rate": 1.3199214916584887e-05,
-      "loss": 1.0697,
+      "epoch": 3.97,
+      "learning_rate": 3.8235294117647064e-07,
+      "loss": 0.9555,
       "step": 13500
     },
     {
-      "epoch": 2.29,
-      "learning_rate": 1.1836222876458403e-05,
-      "loss": 1.0704,
-      "step": 14000
-    },
-    {
-      "epoch": 2.37,
-      "learning_rate": 1.0473230836331916e-05,
-      "loss": 1.0918,
-      "step": 14500
-    },
-    {
-      "epoch": 2.45,
-      "learning_rate": 9.110238796205431e-06,
-      "loss": 1.0878,
-      "step": 15000
-    },
-    {
-      "epoch": 2.54,
-      "learning_rate": 7.747246756078944e-06,
-      "loss": 1.0506,
-      "step": 15500
-    },
-    {
-      "epoch": 2.62,
-      "learning_rate": 6.384254715952459e-06,
-      "loss": 1.0557,
-      "step": 16000
-    },
-    {
-      "epoch": 2.7,
-      "learning_rate": 5.021262675825973e-06,
-      "loss": 1.0325,
-      "step": 16500
-    },
-    {
-      "epoch": 2.78,
-      "learning_rate": 3.658270635699488e-06,
-      "loss": 1.0784,
-      "step": 17000
-    },
-    {
-      "epoch": 2.86,
-      "learning_rate": 2.295278595573002e-06,
-      "loss": 1.0239,
-      "step": 17500
-    },
-    {
-      "epoch": 2.94,
-      "learning_rate": 9.322865554465163e-07,
-      "loss": 1.0211,
-      "step": 18000
-    },
-    {
-      "epoch": 3.0,
-      "step": 18342,
-      "total_flos": 2.148457555862323e+16,
-      "train_loss": 1.4955534830489579,
-      "train_runtime": 15709.331,
-      "train_samples_per_second": 9.34,
-      "train_steps_per_second": 1.168
+      "epoch": 4.0,
+      "step": 13600,
+      "total_flos": 3.918346910230118e+16,
+      "train_loss": 1.4005291703168083,
+      "train_runtime": 11994.4751,
+      "train_samples_per_second": 18.139,
+      "train_steps_per_second": 1.134
     }
   ],
-  "max_steps": 18342,
-  "num_train_epochs": 3,
-  "total_flos": 2.148457555862323e+16,
+  "max_steps": 13600,
+  "num_train_epochs": 4,
+  "total_flos": 3.918346910230118e+16,
   "trial_name": null,
   "trial_params": null
 }
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:906c9235c8bf75589426402d4a87903317a704f12dd138c3607b781101161aea
-size 3503
+oid sha256:d7e19c4b52c1665d4e24c8332861794cb0354d00704d62e085c5f3112b7d82d7
+size 3579