qanastek committed on
Commit
55be9d0
1 Parent(s): 8d66d73

First commit

.gitattributes CHANGED
@@ -1,6 +1,7 @@
  *.7z filter=lfs diff=lfs merge=lfs -text
  *.arrow filter=lfs diff=lfs merge=lfs -text
  *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bin.* filter=lfs diff=lfs merge=lfs -text
  *.bz2 filter=lfs diff=lfs merge=lfs -text
  *.ftz filter=lfs diff=lfs merge=lfs -text
  *.gz filter=lfs diff=lfs merge=lfs -text
@@ -16,11 +17,10 @@
  *.pt filter=lfs diff=lfs merge=lfs -text
  *.pth filter=lfs diff=lfs merge=lfs -text
  *.rar filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.tar.* filter=lfs diff=lfs merge=lfs -text
  *.tflite filter=lfs diff=lfs merge=lfs -text
  *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
  *.xz filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zstandard filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,255 @@
  ---
- license: cc-by-sa-4.0
+ tags:
+ - Transformers
+ - token-classification
+ - sequence-tagger-model
+ language: fr
+ datasets:
+ - qanastek/ANTILLES
+ widget:
+ - text: "George Washington est allé à Washington"
  ---
+
+ # POET: A French Extended Part-of-Speech Tagger
+
+ - Corpus: [ANTILLES](https://github.com/qanastek/ANTILLES)
+ - Embeddings & Sequence Labelling: [CamemBERT](https://arxiv.org/abs/1911.03894)
+ - Number of Epochs: 115
+
+ **People Involved**
+
+ * [LABRAK Yanis](https://www.linkedin.com/in/yanis-labrak-8a7412145/) (1)
+ * [DUFOUR Richard](https://cv.archives-ouvertes.fr/richard-dufour) (2)
+
+ **Affiliations**
+
+ 1. [LIA, NLP team](https://lia.univ-avignon.fr/), Avignon University, Avignon, France.
+ 2. [LS2N, TALN team](https://www.ls2n.fr/equipe/taln/), Nantes University, Nantes, France.
+
+ ## Demo: How to use in HuggingFace Transformers
+
+ Requires [transformers](https://pypi.org/project/transformers/): ```pip install transformers```
+
+ ```python
+ from transformers import CamembertTokenizer, CamembertForTokenClassification, TokenClassificationPipeline
+
+ tokenizer = CamembertTokenizer.from_pretrained('qanastek/pos-french-camembert')
+ model = CamembertForTokenClassification.from_pretrained('qanastek/pos-french-camembert')
+ pos = TokenClassificationPipeline(model=model, tokenizer=tokenizer)
+
+ def make_prediction(sentence):
+     # NB: the pipeline returns one label per subword token, which may not
+     # align one-to-one with the whitespace-separated words zipped below.
+     labels = [l['entity'] for l in pos(sentence)]
+     return list(zip(sentence.split(" "), labels))
+
+ res = make_prediction("George Washington est allé à Washington")
+ ```
+
+ Output:
+
+ ![Preview Output](preview.PNG)
+
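Because the pipeline emits one prediction per subword token, zipping against `sentence.split(" ")` only lines up when no word is split into several pieces. A minimal regrouping sketch, assuming CamemBERT-style pipeline entries whose `word` field marks a word boundary with SentencePiece's `▁` character (the dicts below are hypothetical illustrative output, not a recorded run):

```python
# Group subword-level predictions back into whole words.
# Each dict mimics one entry of a TokenClassificationPipeline result;
# CamemBERT's SentencePiece tokenizer prefixes word starts with "▁".
def regroup(predictions):
    words = []
    for p in predictions:
        piece, label = p["word"], p["entity"]
        if piece.startswith("▁") or not words:
            # New word: strip the boundary marker, keep its first label.
            words.append([piece.lstrip("▁"), label])
        else:
            words[-1][0] += piece  # continuation subword
    return [tuple(w) for w in words]

# Hypothetical subword output for "George Washington est allé à Washington"
preds = [
    {"word": "▁George", "entity": "PROPN"},
    {"word": "▁Washington", "entity": "XFAMIL"},
    {"word": "▁est", "entity": "AUX"},
    {"word": "▁all", "entity": "VPPMS"},
    {"word": "é", "entity": "VPPMS"},
    {"word": "▁à", "entity": "PREP"},
    {"word": "▁Washington", "entity": "PROPN"},
]
print(regroup(preds))
```

With this regrouping, "allé" comes back as a single word carrying one tag even though the tokenizer split it in two.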
+ ## Training data
+
+ `ANTILLES` is a part-of-speech tagging corpus based on [UD_French-GSD](https://universaldependencies.org/treebanks/fr_gsd/index.html), which was originally created in 2015 and is itself based on the [universal dependency treebank v2.0](https://github.com/ryanmcd/uni-dep-tb).
+
+ Originally, the corpus consisted of 400,399 words (16,341 sentences) annotated with 17 different classes. After applying our tag augmentation, we obtain 60 different classes, which add linguistic and semantic information such as gender, number, mood, person, tense or verb form, taken from the different CoNLL-U fields of the original corpus.
+
+ We based our tags on the level of detail provided by the [LIA_TAGG](http://pageperso.lif.univ-mrs.fr/frederic.bechet/download.html) statistical POS tagger, written by [Frédéric Béchet](http://pageperso.lif.univ-mrs.fr/frederic.bechet/index-english.html) in 2001.
+
+ The corpus used for this model is available on [GitHub](https://github.com/qanastek/ANTILLES) in the [CoNLL-U format](https://universaldependencies.org/format.html).
+
+ Training data are fed to the model as raw text and do not go through a normalization phase, which makes the model case- and punctuation-sensitive.
+
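CoNLL-U files can be read with a few lines of standard-library Python. This sketch extracts (token, tag) pairs, assuming the ANTILLES tag sits in the UPOS column (the fourth tab-separated field); the sample sentence is illustrative, not taken from the corpus:

```python
# Minimal CoNLL-U reader: yields one list of (form, tag) pairs per sentence.
# Comment lines start with "#"; sentences are separated by blank lines;
# columns are tab-separated (ID, FORM, LEMMA, UPOS, ...).
def read_conllu(lines):
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            continue
        if not line:
            if sentence:
                yield sentence
                sentence = []
            continue
        cols = line.split("\t")
        sentence.append((cols[1], cols[3]))  # FORM, UPOS slot (ANTILLES tag)
    if sentence:
        yield sentence

# Illustrative two-token sentence in CoNLL-U layout
sample = """# text = Je mange
1\tJe\tje\tPPER1S\t_\t_\t0\t_\t_\t_
2\tmange\tmanger\tVERB\t_\t_\t1\t_\t_\t_
""".splitlines()
print(list(read_conllu(sample)))
```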
+ ## Original Tags
+
+ ```plain
+ PRON VERB SCONJ ADP CCONJ DET NOUN ADJ AUX ADV PUNCT PROPN NUM SYM PART X INTJ
+ ```
+
+ ## New additional POS tags
+
+ | Abbreviation | Description | Examples |
+ |:--------:|:--------:|:--------:|
+ | PREP | Preposition | de |
+ | AUX | Auxiliary Verb | est |
+ | ADV | Adverb | toujours |
+ | COSUB | Subordinating Conjunction | que |
+ | COCO | Coordinating Conjunction | et |
+ | PART | Demonstrative particle | -t |
+ | PRON | Pronoun | qui ce quoi |
+ | PDEMMS | Demonstrative Pronoun - Singular Masculine | ce |
+ | PDEMMP | Demonstrative Pronoun - Plural Masculine | ceux |
+ | PDEMFS | Demonstrative Pronoun - Singular Feminine | cette |
+ | PDEMFP | Demonstrative Pronoun - Plural Feminine | celles |
+ | PINDMS | Indefinite Pronoun - Singular Masculine | tout |
+ | PINDMP | Indefinite Pronoun - Plural Masculine | autres |
+ | PINDFS | Indefinite Pronoun - Singular Feminine | chacune |
+ | PINDFP | Indefinite Pronoun - Plural Feminine | certaines |
+ | PROPN | Proper noun | Houston |
+ | XFAMIL | Last name | Levy |
+ | NUM | Numerical Adjective | trentaine vingtaine |
+ | DINTMS | Masculine Numerical Adjective | un |
+ | DINTFS | Feminine Numerical Adjective | une |
+ | PPOBJMS | Pronoun complements of objects - Singular Masculine | le lui |
+ | PPOBJMP | Pronoun complements of objects - Plural Masculine | eux y |
+ | PPOBJFS | Pronoun complements of objects - Singular Feminine | moi la |
+ | PPOBJFP | Pronoun complements of objects - Plural Feminine | en y |
+ | PPER1S | Personal Pronoun First-Person - Singular | je |
+ | PPER2S | Personal Pronoun Second-Person - Singular | tu |
+ | PPER3MS | Personal Pronoun Third-Person - Singular Masculine | il |
+ | PPER3MP | Personal Pronoun Third-Person - Plural Masculine | ils |
+ | PPER3FS | Personal Pronoun Third-Person - Singular Feminine | elle |
+ | PPER3FP | Personal Pronoun Third-Person - Plural Feminine | elles |
+ | PREFS | Reflexive Pronoun First-Person - Singular | me m' |
+ | PREF | Reflexive Pronoun Third-Person - Singular | se s' |
+ | PREFP | Reflexive Pronoun First / Second-Person - Plural | nous vous |
+ | VERB | Verb | obtient |
+ | VPPMS | Past Participle - Singular Masculine | formulé |
+ | VPPMP | Past Participle - Plural Masculine | classés |
+ | VPPFS | Past Participle - Singular Feminine | appelée |
+ | VPPFP | Past Participle - Plural Feminine | sanctionnées |
+ | DET | Determiner | les l' |
+ | DETMS | Determiner - Singular Masculine | les |
+ | DETFS | Determiner - Singular Feminine | la |
+ | ADJ | Adjective | capable sérieux |
+ | ADJMS | Adjective - Singular Masculine | grand important |
+ | ADJMP | Adjective - Plural Masculine | grands petits |
+ | ADJFS | Adjective - Singular Feminine | française petite |
+ | ADJFP | Adjective - Plural Feminine | légères petites |
+ | NOUN | Noun | temps |
+ | NMS | Noun - Singular Masculine | drapeau |
+ | NMP | Noun - Plural Masculine | journalistes |
+ | NFS | Noun - Singular Feminine | tête |
+ | NFP | Noun - Plural Feminine | ondes |
+ | PREL | Relative Pronoun | qui dont |
+ | PRELMS | Relative Pronoun - Singular Masculine | lequel |
+ | PRELMP | Relative Pronoun - Plural Masculine | lesquels |
+ | PRELFS | Relative Pronoun - Singular Feminine | laquelle |
+ | PRELFP | Relative Pronoun - Plural Feminine | lesquelles |
+ | INTJ | Interjection | merci bref |
+ | CHIF | Numbers | 1979 10 |
+ | SYM | Symbol | € % |
+ | YPFOR | Endpoint | . |
+ | PUNCT | Punctuation | : , |
+ | MOTINC | Unknown words | Technology Lady |
+ | X | Typos & others | sfeir 3D statu |
+
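When a downstream task only needs coarse UD-style classes, the augmented tags can be collapsed back by prefix. The mapping below is a rough illustration built from the table above, not an official ANTILLES conversion:

```python
# Illustrative prefix-based collapse of augmented ANTILLES tags back to
# coarse classes. This mapping is an assumption for demonstration only.
# Order matters: longer/more specific prefixes must come before "N" or "DET".
COARSE = [
    ("PPOBJ", "PRON"), ("PPER", "PRON"), ("PREF", "PRON"),
    ("PDEM", "PRON"), ("PIND", "PRON"), ("PREL", "PRON"),
    ("NUM", "NUM"), ("VPP", "VERB"), ("DINT", "DET"), ("DET", "DET"),
    ("XFAMIL", "PROPN"), ("ADJ", "ADJ"), ("N", "NOUN"),
]

def to_coarse(tag):
    for prefix, coarse in COARSE:
        if tag.startswith(prefix):
            return coarse
    return tag  # already coarse (VERB, ADV, PREP, CHIF, ...)

print([to_coarse(t) for t in ["NFS", "VPPMS", "PPER3FS", "DETMS", "CHIF"]])
```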
+ ## Evaluation results
+
+ The test corpus used for this evaluation is available on [GitHub](https://github.com/qanastek/ANTILLES/blob/main/ANTILLES/test.conllu).
+
+ ```plain
+               precision    recall  f1-score   support
+
+       ADJ      0.9040    0.8828    0.8933       128
+     ADJFP      0.9811    0.9585    0.9697       434
+     ADJFS      0.9606    0.9826    0.9715       918
+     ADJMP      0.9613    0.9357    0.9483       451
+     ADJMS      0.9561    0.9611    0.9586       952
+       ADV      0.9870    0.9948    0.9908      1524
+       AUX      0.9956    0.9964    0.9960      1124
+      CHIF      0.9798    0.9774    0.9786      1239
+      COCO      1.0000    0.9989    0.9994       884
+     COSUB      0.9939    0.9939    0.9939       328
+       DET      0.9972    0.9972    0.9972      2897
+     DETFS      0.9990    1.0000    0.9995      1007
+     DETMS      1.0000    0.9993    0.9996      1426
+    DINTFS      0.9967    0.9902    0.9934       306
+    DINTMS      0.9923    0.9948    0.9935       387
+      INTJ      0.8000    0.8000    0.8000         5
+    MOTINC      0.5049    0.5827    0.5410       266
+       NFP      0.9807    0.9675    0.9740       892
+       NFS      0.9778    0.9699    0.9738      2588
+       NMP      0.9687    0.9495    0.9590      1367
+       NMS      0.9759    0.9560    0.9659      3181
+      NOUN      0.6164    0.8673    0.7206       113
+       NUM      0.6250    0.8333    0.7143         6
+      PART      1.0000    0.9375    0.9677        16
+    PDEMFP      1.0000    1.0000    1.0000         3
+    PDEMFS      1.0000    1.0000    1.0000        89
+    PDEMMP      1.0000    1.0000    1.0000        20
+    PDEMMS      1.0000    1.0000    1.0000       222
+    PINDFP      1.0000    1.0000    1.0000         3
+    PINDFS      0.8571    1.0000    0.9231        12
+    PINDMP      0.9000    1.0000    0.9474         9
+    PINDMS      0.9286    0.9701    0.9489        67
+    PINTFS      0.0000    0.0000    0.0000         2
+    PPER1S      1.0000    1.0000    1.0000        62
+    PPER2S      0.7500    1.0000    0.8571         3
+   PPER3FP      1.0000    1.0000    1.0000         9
+   PPER3FS      1.0000    1.0000    1.0000        96
+   PPER3MP      1.0000    1.0000    1.0000        31
+   PPER3MS      1.0000    1.0000    1.0000       377
+   PPOBJFP      1.0000    0.7500    0.8571         4
+   PPOBJFS      0.9167    0.8919    0.9041        37
+   PPOBJMP      0.7500    0.7500    0.7500        12
+   PPOBJMS      0.9371    0.9640    0.9504       139
+      PREF      1.0000    1.0000    1.0000       332
+     PREFP      1.0000    1.0000    1.0000        64
+     PREFS      1.0000    1.0000    1.0000        13
+      PREL      0.9964    0.9964    0.9964       277
+    PRELFP      1.0000    1.0000    1.0000         5
+    PRELFS      0.8000    1.0000    0.8889         4
+    PRELMP      1.0000    1.0000    1.0000         3
+    PRELMS      1.0000    1.0000    1.0000        11
+      PREP      0.9971    0.9977    0.9974      6161
+      PRON      0.9836    0.9836    0.9836        61
+     PROPN      0.9468    0.9503    0.9486      4310
+     PUNCT      1.0000    1.0000    1.0000      4019
+       SYM      0.9394    0.8158    0.8732        76
+      VERB      0.9956    0.9921    0.9938      2273
+     VPPFP      0.9145    0.9469    0.9304       113
+     VPPFS      0.9562    0.9597    0.9580       273
+     VPPMP      0.8827    0.9728    0.9256       147
+     VPPMS      0.9778    0.9794    0.9786       630
+     VPPRE      0.0000    0.0000    0.0000         1
+         X      0.9604    0.9935    0.9766      1073
+    XFAMIL      0.9386    0.9113    0.9248      1342
+     YPFOR      1.0000    1.0000    1.0000      2750
+
+  accuracy                          0.9778     47574
+ macro avg      0.9151    0.9285    0.9202     47574
+ weighted avg   0.9785    0.9778    0.9780     47574
+ ```
+
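The table above follows the layout of scikit-learn's `classification_report`. The per-class numbers can be recomputed from parallel gold and predicted tag sequences with only the standard library; the short sequences below are toy data, not the real test set:

```python
from collections import Counter

def per_class_f1(gold, pred):
    """Per-class (precision, recall, F1) from parallel tag sequences."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1       # correct tag
        else:
            fp[p] += 1       # predicted tag is a false positive
            fn[g] += 1       # gold tag was missed
    report = {}
    for tag in set(gold) | set(pred):
        prec = tp[tag] / (tp[tag] + fp[tag]) if tp[tag] + fp[tag] else 0.0
        rec = tp[tag] / (tp[tag] + fn[tag]) if tp[tag] + fn[tag] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        report[tag] = (prec, rec, f1)
    return report

# Toy example: one PREP mistagged as NMS
gold = ["NMS", "VERB", "PREP", "NMS"]
pred = ["NMS", "VERB", "NMS", "NMS"]
print(per_class_f1(gold, pred)["NMS"])
```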
+ ## BibTeX Citations
+
+ Please cite the following works when using this model.
+
+ UD_French-GSD corpus:
+
+ ```latex
+ @misc{universaldependencies,
+     title={UniversalDependencies/UD_French-GSD},
+     url={https://github.com/UniversalDependencies/UD_French-GSD},
+     journal={GitHub},
+     author={UniversalDependencies}
+ }
+ ```
+
+ LIA TAGG:
+
+ ```latex
+ @techreport{LIA_TAGG,
+     author = {Frédéric Béchet},
+     title = {LIA_TAGG: a statistical POS tagger + syntactic bracketer},
+     institution = {Aix-Marseille University \& CNRS},
+     year = {2001}
+ }
+ ```
+
+ Flair Embeddings:
+
+ ```latex
+ @inproceedings{akbik2018coling,
+     title = {Contextual String Embeddings for Sequence Labeling},
+     author = {Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},
+     booktitle = {{COLING} 2018, 27th International Conference on Computational Linguistics},
+     pages = {1638--1649},
+     year = {2018}
+ }
+ ```
+
+ ## Acknowledgment
+
+ This work was financially supported by [Zenidoc](https://zenidoc.fr/).
config.json ADDED
@@ -0,0 +1,162 @@
+ {
+   "_name_or_path": "camembert-base",
+   "architectures": [
+     "CamembertForTokenClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 5,
+   "classifier_dropout": null,
+   "eos_token_id": 6,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "PART",
+     "1": "PDEMMP",
+     "10": "NOUN",
+     "11": "PPER3MS",
+     "12": "AUX",
+     "13": "COSUB",
+     "14": "ADJ",
+     "15": "VPPRE",
+     "16": "COCO",
+     "17": "ADJMP",
+     "18": "X",
+     "19": "NMS",
+     "2": "PREFS",
+     "20": "PINDMS",
+     "21": "DETFS",
+     "22": "PPER2S",
+     "23": "PREFP",
+     "24": "PPER3MP",
+     "25": "PRELMP",
+     "26": "PINDFS",
+     "27": "PRON",
+     "28": "PREP",
+     "29": "PPOBJMP",
+     "3": "PINDMP",
+     "30": "ADJFS",
+     "31": "DET",
+     "32": "ADJFP",
+     "33": "PDEMFP",
+     "34": "PREL",
+     "35": "PPER3FS",
+     "36": "VPPFS",
+     "37": "PPER3FP",
+     "38": "CHIF",
+     "39": "NMP",
+     "4": "DINTMS",
+     "40": "SYM",
+     "41": "NFS",
+     "42": "VERB",
+     "43": "PREF",
+     "44": "VPPFP",
+     "45": "PDEMMS",
+     "46": "XFAMIL",
+     "47": "PINDFP",
+     "48": "VPPMP",
+     "49": "YPFOR",
+     "5": "NUM",
+     "50": "ADV",
+     "51": "PRELFS",
+     "52": "DINTFS",
+     "53": "DETMS",
+     "54": "PPOBJFP",
+     "55": "PPOBJMS",
+     "56": "VPPMS",
+     "57": "INTJ",
+     "58": "PROPN",
+     "59": "PDEMFS",
+     "6": "PINTFS",
+     "60": "PPER1S",
+     "61": "PRELFP",
+     "62": "MOTINC",
+     "63": "ADJMS",
+     "64": "PPOBJFS",
+     "7": "NFP",
+     "8": "PUNCT",
+     "9": "PRELMS"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "ADJ": "14",
+     "ADJFP": "32",
+     "ADJFS": "30",
+     "ADJMP": "17",
+     "ADJMS": "63",
+     "ADV": "50",
+     "AUX": "12",
+     "CHIF": "38",
+     "COCO": "16",
+     "COSUB": "13",
+     "DET": "31",
+     "DETFS": "21",
+     "DETMS": "53",
+     "DINTFS": "52",
+     "DINTMS": "4",
+     "INTJ": "57",
+     "MOTINC": "62",
+     "NFP": "7",
+     "NFS": "41",
+     "NMP": "39",
+     "NMS": "19",
+     "NOUN": "10",
+     "NUM": "5",
+     "PART": "0",
+     "PDEMFP": "33",
+     "PDEMFS": "59",
+     "PDEMMP": "1",
+     "PDEMMS": "45",
+     "PINDFP": "47",
+     "PINDFS": "26",
+     "PINDMP": "3",
+     "PINDMS": "20",
+     "PINTFS": "6",
+     "PPER1S": "60",
+     "PPER2S": "22",
+     "PPER3FP": "37",
+     "PPER3FS": "35",
+     "PPER3MP": "24",
+     "PPER3MS": "11",
+     "PPOBJFP": "54",
+     "PPOBJFS": "64",
+     "PPOBJMP": "29",
+     "PPOBJMS": "55",
+     "PREF": "43",
+     "PREFP": "23",
+     "PREFS": "2",
+     "PREL": "34",
+     "PRELFP": "61",
+     "PRELFS": "51",
+     "PRELMP": "25",
+     "PRELMS": "9",
+     "PREP": "28",
+     "PRON": "27",
+     "PROPN": "58",
+     "PUNCT": "8",
+     "SYM": "40",
+     "VERB": "42",
+     "VPPFP": "44",
+     "VPPFS": "36",
+     "VPPMP": "48",
+     "VPPMS": "56",
+     "VPPRE": "15",
+     "X": "18",
+     "XFAMIL": "46",
+     "YPFOR": "49"
+   },
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "camembert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "output_past": true,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.12.5",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 32005
+ }
logs/logs.txt ADDED
@@ -0,0 +1,86 @@
+               precision    recall  f1-score   support
+
+       ADJ      0.9040    0.8828    0.8933       128
+     ADJFP      0.9811    0.9585    0.9697       434
+     ADJFS      0.9606    0.9826    0.9715       918
+     ADJMP      0.9613    0.9357    0.9483       451
+     ADJMS      0.9561    0.9611    0.9586       952
+       ADV      0.9870    0.9948    0.9908      1524
+       AUX      0.9956    0.9964    0.9960      1124
+      CHIF      0.9798    0.9774    0.9786      1239
+      COCO      1.0000    0.9989    0.9994       884
+     COSUB      0.9939    0.9939    0.9939       328
+       DET      0.9972    0.9972    0.9972      2897
+     DETFS      0.9990    1.0000    0.9995      1007
+     DETMS      1.0000    0.9993    0.9996      1426
+    DINTFS      0.9967    0.9902    0.9934       306
+    DINTMS      0.9923    0.9948    0.9935       387
+      INTJ      0.8000    0.8000    0.8000         5
+    MOTINC      0.5049    0.5827    0.5410       266
+       NFP      0.9807    0.9675    0.9740       892
+       NFS      0.9778    0.9699    0.9738      2588
+       NMP      0.9687    0.9495    0.9590      1367
+       NMS      0.9759    0.9560    0.9659      3181
+      NOUN      0.6164    0.8673    0.7206       113
+       NUM      0.6250    0.8333    0.7143         6
+      PART      1.0000    0.9375    0.9677        16
+    PDEMFP      1.0000    1.0000    1.0000         3
+    PDEMFS      1.0000    1.0000    1.0000        89
+    PDEMMP      1.0000    1.0000    1.0000        20
+    PDEMMS      1.0000    1.0000    1.0000       222
+    PINDFP      1.0000    1.0000    1.0000         3
+    PINDFS      0.8571    1.0000    0.9231        12
+    PINDMP      0.9000    1.0000    0.9474         9
+    PINDMS      0.9286    0.9701    0.9489        67
+    PINTFS      0.0000    0.0000    0.0000         2
+    PPER1S      1.0000    1.0000    1.0000        62
+    PPER2S      0.7500    1.0000    0.8571         3
+   PPER3FP      1.0000    1.0000    1.0000         9
+   PPER3FS      1.0000    1.0000    1.0000        96
+   PPER3MP      1.0000    1.0000    1.0000        31
+   PPER3MS      1.0000    1.0000    1.0000       377
+   PPOBJFP      1.0000    0.7500    0.8571         4
+   PPOBJFS      0.9167    0.8919    0.9041        37
+   PPOBJMP      0.7500    0.7500    0.7500        12
+   PPOBJMS      0.9371    0.9640    0.9504       139
+      PREF      1.0000    1.0000    1.0000       332
+     PREFP      1.0000    1.0000    1.0000        64
+     PREFS      1.0000    1.0000    1.0000        13
+      PREL      0.9964    0.9964    0.9964       277
+    PRELFP      1.0000    1.0000    1.0000         5
+    PRELFS      0.8000    1.0000    0.8889         4
+    PRELMP      1.0000    1.0000    1.0000         3
+    PRELMS      1.0000    1.0000    1.0000        11
+      PREP      0.9971    0.9977    0.9974      6161
+      PRON      0.9836    0.9836    0.9836        61
+     PROPN      0.9468    0.9503    0.9486      4310
+     PUNCT      1.0000    1.0000    1.0000      4019
+       SYM      0.9394    0.8158    0.8732        76
+      VERB      0.9956    0.9921    0.9938      2273
+     VPPFP      0.9145    0.9469    0.9304       113
+     VPPFS      0.9562    0.9597    0.9580       273
+     VPPMP      0.8827    0.9728    0.9256       147
+     VPPMS      0.9778    0.9794    0.9786       630
+     VPPRE      0.0000    0.0000    0.0000         1
+         X      0.9604    0.9935    0.9766      1073
+    XFAMIL      0.9386    0.9113    0.9248      1342
+     YPFOR      1.0000    1.0000    1.0000      2750
+
+  accuracy                          0.9778     47574
+ macro avg      0.9151    0.9285    0.9202     47574
+ weighted avg   0.9785    0.9778    0.9780     47574
+
+ DatasetDict({
+     train: Dataset({
+         features: ['id', 'tokens', 'pos_tags'],
+         num_rows: 14453
+     })
+     validation: Dataset({
+         features: ['id', 'tokens', 'pos_tags'],
+         num_rows: 1477
+     })
+     test: Dataset({
+         features: ['id', 'tokens', 'pos_tags'],
+         num_rows: 417
+     })
+ })
optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6d4765236b34abe2f3ff11d7ad1724fd097175f43d78574642c1b1329128f8b2
+ size 880766181
predict.py ADDED
@@ -0,0 +1,14 @@
+ from transformers import CamembertTokenizer, CamembertForTokenClassification, TokenClassificationPipeline
+
+ OUTPUT_PATH = './'
+
+ tokenizer = CamembertTokenizer.from_pretrained(OUTPUT_PATH)
+ model = CamembertForTokenClassification.from_pretrained(OUTPUT_PATH)
+
+ pos = TokenClassificationPipeline(model=model, tokenizer=tokenizer)
+
+ def make_prediction(sentence):
+     labels = [l['entity'] for l in pos(sentence)]
+     return list(zip(sentence.split(" "), labels))
+
+ res = make_prediction("George Washington est allé à Washington")
preview.PNG ADDED
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:995aba8183b98bcbeeeb548ea48ea9daea4f9143c62c0097738df40bdc21aa54
+ size 440410097
rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4bf1777801e2479389bdfb66a8aa6f361d4127c6a9b4249841100ffe937982fb
+ size 16543
scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9917c381e092c6b1349bc30c0b89ccbac8dda8de23bce2c62425520bd40ac616
+ size 623
sentencepiece.bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:988bc5a00281c6d210a5d34bd143d0363741a432fefe741bf71e61b1869d4314
+ size 810912
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": false}, "additional_special_tokens": ["<s>NOTUSED", "</s>NOTUSED"]}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"bos_token": "<s>", "eos_token": "</s>", "sep_token": "</s>", "cls_token": "<s>", "unk_token": "<unk>", "pad_token": "<pad>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "additional_special_tokens": ["<s>NOTUSED", "</s>NOTUSED"], "model_max_length": 512, "special_tokens_map_file": null, "name_or_path": "camembert-base", "tokenizer_class": "CamembertTokenizer"}
trainer_state.json ADDED
@@ -0,0 +1,388 @@
+ {
+   "best_metric": null,
+   "best_model_checkpoint": null,
+   "epoch": 19.90049751243781,
+   "global_step": 12000,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 0.83,
+       "learning_rate": 4.792703150912106e-05,
+       "loss": 1.636,
+       "step": 500
+     },
+     {
+       "epoch": 1.0,
+       "eval_accuracy": 0.9660949257998066,
+       "eval_f1": 0.9648470438617105,
+       "eval_loss": 0.40596073865890503,
+       "eval_precision": 0.9632890778105937,
+       "eval_recall": 0.9664100575985822,
+       "eval_runtime": 12.913,
+       "eval_samples_per_second": 114.381,
+       "eval_steps_per_second": 4.801,
+       "step": 603
+     },
+     {
+       "epoch": 1.66,
+       "learning_rate": 4.5854063018242126e-05,
+       "loss": 0.3601,
+       "step": 1000
+     },
+     {
+       "epoch": 2.0,
+       "eval_accuracy": 0.9747971581115735,
+       "eval_f1": 0.9757520460075205,
+       "eval_loss": 0.16257813572883606,
+       "eval_precision": 0.9742435954063604,
+       "eval_recall": 0.9772651750110767,
+       "eval_runtime": 13.0776,
+       "eval_samples_per_second": 112.941,
+       "eval_steps_per_second": 4.741,
+       "step": 1206
+     },
+     {
+       "epoch": 2.49,
+       "learning_rate": 4.3781094527363184e-05,
+       "loss": 0.1654,
+       "step": 1500
+     },
+     {
+       "epoch": 3.0,
+       "eval_accuracy": 0.9755118341951486,
+       "eval_f1": 0.9770373954517177,
+       "eval_loss": 0.11993579566478729,
+       "eval_precision": 0.9755404025066946,
+       "eval_recall": 0.9785389898094816,
+       "eval_runtime": 12.7573,
+       "eval_samples_per_second": 115.777,
+       "eval_steps_per_second": 4.86,
+       "step": 1809
+     },
+     {
+       "epoch": 3.32,
+       "learning_rate": 4.170812603648425e-05,
+       "loss": 0.1051,
+       "step": 2000
+     },
+     {
+       "epoch": 4.0,
+       "eval_accuracy": 0.9774877033673856,
+       "eval_f1": 0.9789900275245854,
+       "eval_loss": 0.10314257442951202,
+       "eval_precision": 0.9779755160693067,
+       "eval_recall": 0.9800066459902526,
+       "eval_runtime": 12.7609,
+       "eval_samples_per_second": 115.744,
+       "eval_steps_per_second": 4.859,
+       "step": 2412
+     },
+     {
+       "epoch": 4.15,
+       "learning_rate": 3.9635157545605314e-05,
+       "loss": 0.072,
+       "step": 2500
+     },
+     {
+       "epoch": 4.98,
+       "learning_rate": 3.756218905472637e-05,
+       "loss": 0.0557,
+       "step": 3000
+     },
+     {
+       "epoch": 5.0,
+       "eval_accuracy": 0.9762895699331567,
+       "eval_f1": 0.979079382198808,
+       "eval_loss": 0.10313071310520172,
+       "eval_precision": 0.9777679582424259,
+       "eval_recall": 0.9803943287549844,
+       "eval_runtime": 12.8291,
+       "eval_samples_per_second": 115.129,
+       "eval_steps_per_second": 4.833,
+       "step": 3015
+     },
+     {
+       "epoch": 5.8,
+       "learning_rate": 3.548922056384743e-05,
+       "loss": 0.0406,
+       "step": 3500
+     },
+     {
+       "epoch": 6.0,
+       "eval_accuracy": 0.977109345440787,
+       "eval_f1": 0.9801621337465068,
+       "eval_loss": 0.10502293705940247,
+       "eval_precision": 0.9793221650909493,
+       "eval_recall": 0.9810035445281347,
+       "eval_runtime": 12.7666,
+       "eval_samples_per_second": 115.693,
+       "eval_steps_per_second": 4.856,
+       "step": 3618
+     },
+     {
+       "epoch": 6.63,
+       "learning_rate": 3.341625207296849e-05,
+       "loss": 0.0332,
+       "step": 4000
+     },
+     {
+       "epoch": 7.0,
+       "eval_accuracy": 0.9776558624458738,
+       "eval_f1": 0.9800943409276397,
+       "eval_loss": 0.10666096210479736,
+       "eval_precision": 0.9791868210840543,
+       "eval_recall": 0.9810035445281347,
+       "eval_runtime": 12.7474,
+       "eval_samples_per_second": 115.867,
+       "eval_steps_per_second": 4.864,
+       "step": 4221
+     },
+     {
+       "epoch": 7.46,
+       "learning_rate": 3.1343283582089554e-05,
+       "loss": 0.0248,
+       "step": 4500
+     },
+     {
+       "epoch": 8.0,
+       "eval_accuracy": 0.9782654391053937,
+       "eval_f1": 0.9810459324847814,
+       "eval_loss": 0.10935021936893463,
+       "eval_precision": 0.9802864410528644,
+       "eval_recall": 0.9818066016836509,
+       "eval_runtime": 12.9986,
+       "eval_samples_per_second": 113.628,
+       "eval_steps_per_second": 4.77,
+       "step": 4824
+     },
+     {
+       "epoch": 8.29,
+       "learning_rate": 2.9270315091210616e-05,
+       "loss": 0.0209,
+       "step": 5000
+     },
+     {
+       "epoch": 9.0,
+       "eval_accuracy": 0.9786648169168033,
+       "eval_f1": 0.9818035894668382,
+       "eval_loss": 0.1135268285870552,
+       "eval_precision": 0.9812197483059051,
+       "eval_recall": 0.9823881258307487,
+       "eval_runtime": 12.5526,
+       "eval_samples_per_second": 117.665,
+       "eval_steps_per_second": 4.939,
+       "step": 5427
+     },
+     {
+       "epoch": 9.12,
+       "learning_rate": 2.7197346600331674e-05,
+       "loss": 0.0177,
+       "step": 5500
+     },
+     {
+       "epoch": 9.95,
+       "learning_rate": 2.512437810945274e-05,
+       "loss": 0.0145,
+       "step": 6000
+     },
+     {
+       "epoch": 10.0,
+       "eval_accuracy": 0.9779291209484172,
+       "eval_f1": 0.9809597608900205,
+       "eval_loss": 0.12104799598455429,
+       "eval_precision": 0.980362871999115,
+       "eval_recall": 0.9815573770491803,
+       "eval_runtime": 12.8411,
+       "eval_samples_per_second": 115.021,
+       "eval_steps_per_second": 4.828,
+       "step": 6030
+     },
+     {
+       "epoch": 10.78,
+       "learning_rate": 2.3051409618573798e-05,
+       "loss": 0.0112,
+       "step": 6500
+     },
+     {
+       "epoch": 11.0,
+       "eval_accuracy": 0.9774246437129525,
+       "eval_f1": 0.9806444472114999,
+       "eval_loss": 0.12379806488752365,
+       "eval_precision": 0.9798988027760113,
+       "eval_recall": 0.9813912272928667,
+       "eval_runtime": 12.8231,
+       "eval_samples_per_second": 115.183,
+       "eval_steps_per_second": 4.835,
+       "step": 6633
+     },
+     {
+       "epoch": 11.61,
+       "learning_rate": 2.097844112769486e-05,
+       "loss": 0.0105,
+       "step": 7000
+     },
+     {
+       "epoch": 12.0,
+       "eval_accuracy": 0.9780552402572834,
+       "eval_f1": 0.9810323598179328,
+       "eval_loss": 0.1279357671737671,
+       "eval_precision": 0.9802593381072189,
+       "eval_recall": 0.9818066016836509,
+       "eval_runtime": 12.5624,
+       "eval_samples_per_second": 117.573,
+       "eval_steps_per_second": 4.935,
+       "step": 7236
+     },
+     {
+       "epoch": 12.44,
+       "learning_rate": 1.890547263681592e-05,
+       "loss": 0.0088,
+       "step": 7500
+     },
+     {
+       "epoch": 13.0,
+       "eval_accuracy": 0.9773405641737083,
+       "eval_f1": 0.9802169221404461,
+       "eval_loss": 0.1307593435049057,
+       "eval_precision": 0.9794039588632091,
+       "eval_recall": 0.981031236154187,
+       "eval_runtime": 12.6149,
+       "eval_samples_per_second": 117.084,
+       "eval_steps_per_second": 4.915,
+       "step": 7839
+     },
+     {
+       "epoch": 13.27,
+       "learning_rate": 1.6832504145936983e-05,
+       "loss": 0.0078,
+       "step": 8000
+     },
+     {
+       "epoch": 14.0,
+       "eval_accuracy": 0.9781182999117165,
+       "eval_f1": 0.980741027698608,
+       "eval_loss": 0.13244280219078064,
+       "eval_precision": 0.9800088480893657,
+       "eval_recall": 0.9814743021710235,
+       "eval_runtime": 12.7732,
+       "eval_samples_per_second": 115.633,
+       "eval_steps_per_second": 4.854,
+       "step": 8442
+     },
+     {
+       "epoch": 14.1,
+       "learning_rate": 1.4759535655058043e-05,
+       "loss": 0.0063,
+       "step": 8500
+     },
+     {
+       "epoch": 14.93,
+       "learning_rate": 1.2686567164179105e-05,
+       "loss": 0.0056,
+       "step": 9000
+     },
+     {
+       "epoch": 15.0,
+       "eval_accuracy": 0.9781393197965275,
+       "eval_f1": 0.9811566131710017,
+       "eval_loss": 0.13830867409706116,
+       "eval_precision": 0.9803970360539703,
+       "eval_recall": 0.98191736818786,
+       "eval_runtime": 12.5885,
+       "eval_samples_per_second": 117.329,
+       "eval_steps_per_second": 4.925,
+       "step": 9045
+     },
+     {
+       "epoch": 15.75,
+       "learning_rate": 1.0613598673300167e-05,
+       "loss": 0.0046,
+       "step": 9500
+     },
+     {
+       "epoch": 16.0,
+       "eval_accuracy": 0.9780972800269054,
+       "eval_f1": 0.9813613029099613,
+       "eval_loss": 0.1351945400238037,
+       "eval_precision": 0.9807506153718505,
+       "eval_recall": 0.9819727514399645,
+       "eval_runtime": 12.6382,
+       "eval_samples_per_second": 116.868,
+       "eval_steps_per_second": 4.906,
+       "step": 9648
+     },
+     {
+       "epoch": 16.58,
+       "learning_rate": 8.540630182421228e-06,
+       "loss": 0.0046,
+       "step": 10000
+     },
+     {
+       "epoch": 17.0,
+       "eval_accuracy": 0.977845041409173,
+       "eval_f1": 0.9810062666869561,
+       "eval_loss": 0.14340569078922272,
+       "eval_precision": 0.9801520387007602,
+       "eval_recall": 0.9818619849357554,
+       "eval_runtime": 12.6576,
+       "eval_samples_per_second": 116.688,
+       "eval_steps_per_second": 4.898,
+       "step": 10251
+     },
+     {
+       "epoch": 17.41,
+       "learning_rate": 6.467661691542288e-06,
+       "loss": 0.0033,
+       "step": 10500
+     },
+     {
+       "epoch": 18.0,
+       "eval_accuracy": 0.9776979022154959,
+       "eval_f1": 0.9810448835021307,
+       "eval_loss": 0.1425262689590454,
+       "eval_precision": 0.9803395642074991,
+       "eval_recall": 0.9817512184315463,
+       "eval_runtime": 12.8452,
+       "eval_samples_per_second": 114.985,
+       "eval_steps_per_second": 4.827,
+       "step": 10854
+     },
+     {
+       "epoch": 18.24,
+       "learning_rate": 4.39469320066335e-06,
+       "loss": 0.0032,
+       "step": 11000
+     },
+     {
+       "epoch": 19.0,
+       "eval_accuracy": 0.9780972800269054,
+       "eval_f1": 0.9813638816253684,
+       "eval_loss": 0.14492034912109375,
+       "eval_precision": 0.9806176901595377,
+       "eval_recall": 0.982111209570226,
+       "eval_runtime": 12.4458,
+       "eval_samples_per_second": 118.675,
+       "eval_steps_per_second": 4.982,
+       "step": 11457
+     },
+     {
+       "epoch": 19.07,
+       "learning_rate": 2.3217247097844113e-06,
+       "loss": 0.003,
+       "step": 11500
+     },
+     {
+       "epoch": 19.9,
+       "learning_rate": 2.4875621890547267e-07,
+       "loss": 0.0029,
+       "step": 12000
+     }
+   ],
+   "max_steps": 12060,
+   "num_train_epochs": 20,
+   "total_flos": 1.21984229568579e+16,
+   "trial_name": null,
+   "trial_params": null
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6f69d3406e6a240d9dca8d53ded465b0836fb3e46b9706b94b6e58db548f7051
+ size 2863