Commit 4027d82 by qanastek (parent: 073fb1c): Update README.md

README.md ADDED
---
tags:
- flair
- token-classification
- sequence-tagger-model
language: fr
widget:
- text: "George Washington est allé à Washington"
---

# POET: A French Extended Part-of-Speech Tagger

- Corpus: [ANTILLES](https://github.com/qanastek/ANTILLES)
- Embeddings: [CamemBERT](https://arxiv.org/abs/1911.03894)
- Sequence Labelling: [Transformers](https://arxiv.org/abs/1706.03762)
- Number of Epochs: 115

**People Involved**

* [LABRAK Yanis](https://www.linkedin.com/in/yanis-labrak-8a7412145/) (1)
* [DUFOUR Richard](https://cv.archives-ouvertes.fr/richard-dufour) (2)

**Affiliations**

1. [LIA, NLP team](https://lia.univ-avignon.fr/), Avignon University, Avignon, France.
2. [LS2N, TALN team](https://www.ls2n.fr/equipe/taln/), Nantes University, Nantes, France.

## Demo: How to use in HuggingFace Transformers

Requires [transformers](https://pypi.org/project/transformers/): `pip install transformers`

```python
from transformers import CamembertTokenizer, CamembertForTokenClassification, TokenClassificationPipeline

# Load the fine-tuned model and its tokenizer (here from the current directory)
tokenizer = CamembertTokenizer.from_pretrained('./')
model = CamembertForTokenClassification.from_pretrained('./')
pos = TokenClassificationPipeline(model=model, tokenizer=tokenizer)

def make_prediction(sentence):
    # Pair each whitespace-separated word with its predicted tag
    labels = [l['entity'] for l in pos(sentence)]
    return list(zip(sentence.split(" "), labels))

res = make_prediction("George Washington est allé à Washington")
```

Output:

![Preview Output](preview.PNG)

## Training data

`ANTILLES` is a part-of-speech tagging corpus based on [UD_French-GSD](https://universaldependencies.org/treebanks/fr_gsd/index.html), which was originally created in 2015 and is itself based on the [universal dependency treebank v2.0](https://github.com/ryanmcd/uni-dep-tb).

The original corpus consists of 400,399 words (16,341 sentences) annotated with 17 distinct classes. After applying our tag augmentation, we obtain 60 distinct classes, which add linguistic and semantic information such as gender, number, mood, person, tense and verb form, derived from the CoNLL-U fields of the original corpus.

We based our tags on the level of detail provided by the [LIA_TAGG](http://pageperso.lif.univ-mrs.fr/frederic.bechet/download.html) statistical POS tagger, written by [Frédéric Béchet](http://pageperso.lif.univ-mrs.fr/frederic.bechet/index-english.html) in 2001.

The corpus used for this model is available on [GitHub](https://github.com/qanastek/ANTILLES) in the [CoNLL-U format](https://universaldependencies.org/format.html).

The training data are fed to the model as raw text and do not pass through a normalization phase, which makes the model case- and punctuation-sensitive.

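Since the corpus is distributed in CoNLL-U, a minimal pure-Python sketch of reading (token, tag) pairs from such a file may help. The sample sentence and the placement of the extended tag in the XPOS column (index 4) are illustrative assumptions, not taken from the actual ANTILLES files; the column layout follows the CoNLL-U specification.

```python
# Minimal CoNLL-U reader: extracts (FORM, XPOS) pairs per sentence.
def read_conllu(text):
    sentences = []
    current = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level metadata
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multi-word ranges and empty nodes
        current.append((cols[1], cols[4]))  # FORM, XPOS
    if current:
        sentences.append(current)
    return sentences

# Hypothetical two-token sentence in CoNLL-U layout
sample = (
    "# text = Il dort\n"
    "1\tIl\til\tPRON\tPPER3MS\t_\t2\tnsubj\t_\t_\n"
    "2\tdort\tdormir\tVERB\tVERB\t_\t0\troot\t_\t_\n"
)
print(read_conllu(sample))
```
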
## Original Tags

```plain
PRON VERB SCONJ ADP CCONJ DET NOUN ADJ AUX ADV PUNCT PROPN NUM SYM PART X INTJ
```

## New additional POS tags

| Abbreviation | Description | Examples |
|:--------:|:--------:|:--------:|
| PREP | Preposition | de |
| AUX | Auxiliary Verb | est |
| ADV | Adverb | toujours |
| COSUB | Subordinating conjunction | que |
| COCO | Coordinating Conjunction | et |
| PART | Demonstrative particle | -t |
| PRON | Pronoun | qui ce quoi |
| PDEMMS | Demonstrative Pronoun - Singular Masculine | ce |
| PDEMMP | Demonstrative Pronoun - Plural Masculine | ceux |
| PDEMFS | Demonstrative Pronoun - Singular Feminine | cette |
| PDEMFP | Demonstrative Pronoun - Plural Feminine | celles |
| PINDMS | Indefinite Pronoun - Singular Masculine | tout |
| PINDMP | Indefinite Pronoun - Plural Masculine | autres |
| PINDFS | Indefinite Pronoun - Singular Feminine | chacune |
| PINDFP | Indefinite Pronoun - Plural Feminine | certaines |
| PROPN | Proper noun | Houston |
| XFAMIL | Last name | Levy |
| NUM | Numerical Adjective | trentaine vingtaine |
| DINTMS | Masculine Numerical Adjective | un |
| DINTFS | Feminine Numerical Adjective | une |
| PPOBJMS | Pronoun complements of objects - Singular Masculine | le lui |
| PPOBJMP | Pronoun complements of objects - Plural Masculine | eux y |
| PPOBJFS | Pronoun complements of objects - Singular Feminine | moi la |
| PPOBJFP | Pronoun complements of objects - Plural Feminine | en y |
| PPER1S | Personal Pronoun First-Person - Singular | je |
| PPER2S | Personal Pronoun Second-Person - Singular | tu |
| PPER3MS | Personal Pronoun Third-Person - Singular Masculine | il |
| PPER3MP | Personal Pronoun Third-Person - Plural Masculine | ils |
| PPER3FS | Personal Pronoun Third-Person - Singular Feminine | elle |
| PPER3FP | Personal Pronoun Third-Person - Plural Feminine | elles |
| PREFS | Reflexive Pronoun First-Person - Singular | me m' |
| PREF | Reflexive Pronoun Third-Person - Singular | se s' |
| PREFP | Reflexive Pronoun First / Second-Person - Plural | nous vous |
| VERB | Verb | obtient |
| VPPMS | Past Participle - Singular Masculine | formulé |
| VPPMP | Past Participle - Plural Masculine | classés |
| VPPFS | Past Participle - Singular Feminine | appelée |
| VPPFP | Past Participle - Plural Feminine | sanctionnées |
| DET | Determiner | les l' |
| DETMS | Determiner - Singular Masculine | les |
| DETFS | Determiner - Singular Feminine | la |
| ADJ | Adjective | capable sérieux |
| ADJMS | Adjective - Singular Masculine | grand important |
| ADJMP | Adjective - Plural Masculine | grands petits |
| ADJFS | Adjective - Singular Feminine | française petite |
| ADJFP | Adjective - Plural Feminine | légères petites |
| NOUN | Noun | temps |
| NMS | Noun - Singular Masculine | drapeau |
| NMP | Noun - Plural Masculine | journalistes |
| NFS | Noun - Singular Feminine | tête |
| NFP | Noun - Plural Feminine | ondes |
| PREL | Relative Pronoun | qui dont |
| PRELMS | Relative Pronoun - Singular Masculine | lequel |
| PRELMP | Relative Pronoun - Plural Masculine | lesquels |
| PRELFS | Relative Pronoun - Singular Feminine | laquelle |
| PRELFP | Relative Pronoun - Plural Feminine | lesquelles |
| INTJ | Interjection | merci bref |
| CHIF | Numbers | 1979 10 |
| SYM | Symbol | € % |
| YPFOR | Sentence endpoint | . |
| PUNCT | Punctuation | : , |
| MOTINC | Unknown words | Technology Lady |
| X | Typos & others | sfeir 3D statu |

## Evaluation results

The test corpus used for this evaluation is available on [GitHub](https://github.com/qanastek/ANTILLES/blob/main/ANTILLES/test.conllu).

```plain
              precision    recall  f1-score   support

         ADJ     0.9040    0.8828    0.8933       128
       ADJFP     0.9811    0.9585    0.9697       434
       ADJFS     0.9606    0.9826    0.9715       918
       ADJMP     0.9613    0.9357    0.9483       451
       ADJMS     0.9561    0.9611    0.9586       952
         ADV     0.9870    0.9948    0.9908      1524
         AUX     0.9956    0.9964    0.9960      1124
        CHIF     0.9798    0.9774    0.9786      1239
        COCO     1.0000    0.9989    0.9994       884
       COSUB     0.9939    0.9939    0.9939       328
         DET     0.9972    0.9972    0.9972      2897
       DETFS     0.9990    1.0000    0.9995      1007
       DETMS     1.0000    0.9993    0.9996      1426
      DINTFS     0.9967    0.9902    0.9934       306
      DINTMS     0.9923    0.9948    0.9935       387
        INTJ     0.8000    0.8000    0.8000         5
      MOTINC     0.5049    0.5827    0.5410       266
         NFP     0.9807    0.9675    0.9740       892
         NFS     0.9778    0.9699    0.9738      2588
         NMP     0.9687    0.9495    0.9590      1367
         NMS     0.9759    0.9560    0.9659      3181
        NOUN     0.6164    0.8673    0.7206       113
         NUM     0.6250    0.8333    0.7143         6
        PART     1.0000    0.9375    0.9677        16
      PDEMFP     1.0000    1.0000    1.0000         3
      PDEMFS     1.0000    1.0000    1.0000        89
      PDEMMP     1.0000    1.0000    1.0000        20
      PDEMMS     1.0000    1.0000    1.0000       222
      PINDFP     1.0000    1.0000    1.0000         3
      PINDFS     0.8571    1.0000    0.9231        12
      PINDMP     0.9000    1.0000    0.9474         9
      PINDMS     0.9286    0.9701    0.9489        67
      PINTFS     0.0000    0.0000    0.0000         2
      PPER1S     1.0000    1.0000    1.0000        62
      PPER2S     0.7500    1.0000    0.8571         3
     PPER3FP     1.0000    1.0000    1.0000         9
     PPER3FS     1.0000    1.0000    1.0000        96
     PPER3MP     1.0000    1.0000    1.0000        31
     PPER3MS     1.0000    1.0000    1.0000       377
     PPOBJFP     1.0000    0.7500    0.8571         4
     PPOBJFS     0.9167    0.8919    0.9041        37
     PPOBJMP     0.7500    0.7500    0.7500        12
     PPOBJMS     0.9371    0.9640    0.9504       139
        PREF     1.0000    1.0000    1.0000       332
       PREFP     1.0000    1.0000    1.0000        64
       PREFS     1.0000    1.0000    1.0000        13
        PREL     0.9964    0.9964    0.9964       277
      PRELFP     1.0000    1.0000    1.0000         5
      PRELFS     0.8000    1.0000    0.8889         4
      PRELMP     1.0000    1.0000    1.0000         3
      PRELMS     1.0000    1.0000    1.0000        11
        PREP     0.9971    0.9977    0.9974      6161
        PRON     0.9836    0.9836    0.9836        61
       PROPN     0.9468    0.9503    0.9486      4310
       PUNCT     1.0000    1.0000    1.0000      4019
         SYM     0.9394    0.8158    0.8732        76
        VERB     0.9956    0.9921    0.9938      2273
       VPPFP     0.9145    0.9469    0.9304       113
       VPPFS     0.9562    0.9597    0.9580       273
       VPPMP     0.8827    0.9728    0.9256       147
       VPPMS     0.9778    0.9794    0.9786       630
       VPPRE     0.0000    0.0000    0.0000         1
           X     0.9604    0.9935    0.9766      1073
      XFAMIL     0.9386    0.9113    0.9248      1342
       YPFOR     1.0000    1.0000    1.0000      2750

    accuracy                         0.9778     47574
   macro avg     0.9151    0.9285    0.9202     47574
weighted avg     0.9785    0.9778    0.9780     47574
```
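The per-class figures above are the usual precision, recall and F1. As a reference, a small sketch computing them for one class from parallel gold/predicted tag lists (toy data, not the actual test set):

```python
def prf(gold, pred, cls):
    # Precision / recall / F1 for a single class, as in the table above
    tp = sum(1 for g, p in zip(gold, pred) if p == cls and g == cls)
    fp = sum(1 for g, p in zip(gold, pred) if p == cls and g != cls)
    fn = sum(1 for g, p in zip(gold, pred) if p != cls and g == cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example: one NMS token was mis-tagged as VERB
gold = ["NMS", "VERB", "NMS", "ADV"]
pred = ["NMS", "VERB", "VERB", "ADV"]
print(prf(gold, pred, "NMS"))  # precision 1.0, recall 0.5
```
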

## BibTeX Citations

Please cite the following works when using this model.

UD_French-GSD corpus:

```latex
@misc{universaldependencies,
    title={UniversalDependencies/UD_French-GSD},
    url={https://github.com/UniversalDependencies/UD_French-GSD},
    journal={GitHub},
    author={UniversalDependencies}
}
```

LIA_TAGG:

```latex
@techreport{LIA_TAGG,
    author = {Frédéric Béchet},
    title = {LIA_TAGG: a statistical POS tagger + syntactic bracketer},
    institution = {Aix-Marseille University & CNRS},
    year = {2001}
}
```

Flair Embeddings:

```latex
@inproceedings{akbik2018coling,
    title={Contextual String Embeddings for Sequence Labeling},
    author={Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},
    booktitle = {{COLING} 2018, 27th International Conference on Computational Linguistics},
    pages = {1638--1649},
    year = {2018}
}
```

## Acknowledgment

This work was financially supported by [Zenidoc](https://zenidoc.fr/).
config.json ADDED
{
  "_name_or_path": "camembert-base",
  "architectures": [
    "CamembertForTokenClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 5,
  "classifier_dropout": null,
  "eos_token_id": 6,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "PART",
    "1": "PDEMMP",
    "10": "NOUN",
    "11": "PPER3MS",
    "12": "AUX",
    "13": "COSUB",
    "14": "ADJ",
    "15": "VPPRE",
    "16": "COCO",
    "17": "ADJMP",
    "18": "X",
    "19": "NMS",
    "2": "PREFS",
    "20": "PINDMS",
    "21": "DETFS",
    "22": "PPER2S",
    "23": "PREFP",
    "24": "PPER3MP",
    "25": "PRELMP",
    "26": "PINDFS",
    "27": "PRON",
    "28": "PREP",
    "29": "PPOBJMP",
    "3": "PINDMP",
    "30": "ADJFS",
    "31": "DET",
    "32": "ADJFP",
    "33": "PDEMFP",
    "34": "PREL",
    "35": "PPER3FS",
    "36": "VPPFS",
    "37": "PPER3FP",
    "38": "CHIF",
    "39": "NMP",
    "4": "DINTMS",
    "40": "SYM",
    "41": "NFS",
    "42": "VERB",
    "43": "PREF",
    "44": "VPPFP",
    "45": "PDEMMS",
    "46": "XFAMIL",
    "47": "PINDFP",
    "48": "VPPMP",
    "49": "YPFOR",
    "5": "NUM",
    "50": "ADV",
    "51": "PRELFS",
    "52": "DINTFS",
    "53": "DETMS",
    "54": "PPOBJFP",
    "55": "PPOBJMS",
    "56": "VPPMS",
    "57": "INTJ",
    "58": "PROPN",
    "59": "PDEMFS",
    "6": "PINTFS",
    "60": "PPER1S",
    "61": "PRELFP",
    "62": "MOTINC",
    "63": "ADJMS",
    "64": "PPOBJFS",
    "7": "NFP",
    "8": "PUNCT",
    "9": "PRELMS"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "ADJ": "14",
    "ADJFP": "32",
    "ADJFS": "30",
    "ADJMP": "17",
    "ADJMS": "63",
    "ADV": "50",
    "AUX": "12",
    "CHIF": "38",
    "COCO": "16",
    "COSUB": "13",
    "DET": "31",
    "DETFS": "21",
    "DETMS": "53",
    "DINTFS": "52",
    "DINTMS": "4",
    "INTJ": "57",
    "MOTINC": "62",
    "NFP": "7",
    "NFS": "41",
    "NMP": "39",
    "NMS": "19",
    "NOUN": "10",
    "NUM": "5",
    "PART": "0",
    "PDEMFP": "33",
    "PDEMFS": "59",
    "PDEMMP": "1",
    "PDEMMS": "45",
    "PINDFP": "47",
    "PINDFS": "26",
    "PINDMP": "3",
    "PINDMS": "20",
    "PINTFS": "6",
    "PPER1S": "60",
    "PPER2S": "22",
    "PPER3FP": "37",
    "PPER3FS": "35",
    "PPER3MP": "24",
    "PPER3MS": "11",
    "PPOBJFP": "54",
    "PPOBJFS": "64",
    "PPOBJMP": "29",
    "PPOBJMS": "55",
    "PREF": "43",
    "PREFP": "23",
    "PREFS": "2",
    "PREL": "34",
    "PRELFP": "61",
    "PRELFS": "51",
    "PRELMP": "25",
    "PRELMS": "9",
    "PREP": "28",
    "PRON": "27",
    "PROPN": "58",
    "PUNCT": "8",
    "SYM": "40",
    "VERB": "42",
    "VPPFP": "44",
    "VPPFS": "36",
    "VPPMP": "48",
    "VPPMS": "56",
    "VPPRE": "15",
    "X": "18",
    "XFAMIL": "46",
    "YPFOR": "49"
  },
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "camembert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "output_past": true,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.12.5",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 32005
}
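The `id2label` / `label2id` tables above are what the pipeline uses to convert class indices into tag strings. A small sketch of that round trip, using an excerpt of the mapping (full table omitted):

```python
# Excerpt of the mappings from the config above (keys are strings, as in the JSON)
id2label = {"0": "PART", "11": "PPER3MS", "42": "VERB", "49": "YPFOR"}
label2id = {v: k for k, v in id2label.items()}

logit_argmax = 42                 # index of the highest logit for some token
tag = id2label[str(logit_argmax)] # class index -> tag string
print(tag)                        # VERB
```
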
logs/logs.txt ADDED
              precision    recall  f1-score   support

         ADJ     0.9040    0.8828    0.8933       128
       ADJFP     0.9811    0.9585    0.9697       434
       ADJFS     0.9606    0.9826    0.9715       918
       ADJMP     0.9613    0.9357    0.9483       451
       ADJMS     0.9561    0.9611    0.9586       952
         ADV     0.9870    0.9948    0.9908      1524
         AUX     0.9956    0.9964    0.9960      1124
        CHIF     0.9798    0.9774    0.9786      1239
        COCO     1.0000    0.9989    0.9994       884
       COSUB     0.9939    0.9939    0.9939       328
         DET     0.9972    0.9972    0.9972      2897
       DETFS     0.9990    1.0000    0.9995      1007
       DETMS     1.0000    0.9993    0.9996      1426
      DINTFS     0.9967    0.9902    0.9934       306
      DINTMS     0.9923    0.9948    0.9935       387
        INTJ     0.8000    0.8000    0.8000         5
      MOTINC     0.5049    0.5827    0.5410       266
         NFP     0.9807    0.9675    0.9740       892
         NFS     0.9778    0.9699    0.9738      2588
         NMP     0.9687    0.9495    0.9590      1367
         NMS     0.9759    0.9560    0.9659      3181
        NOUN     0.6164    0.8673    0.7206       113
         NUM     0.6250    0.8333    0.7143         6
        PART     1.0000    0.9375    0.9677        16
      PDEMFP     1.0000    1.0000    1.0000         3
      PDEMFS     1.0000    1.0000    1.0000        89
      PDEMMP     1.0000    1.0000    1.0000        20
      PDEMMS     1.0000    1.0000    1.0000       222
      PINDFP     1.0000    1.0000    1.0000         3
      PINDFS     0.8571    1.0000    0.9231        12
      PINDMP     0.9000    1.0000    0.9474         9
      PINDMS     0.9286    0.9701    0.9489        67
      PINTFS     0.0000    0.0000    0.0000         2
      PPER1S     1.0000    1.0000    1.0000        62
      PPER2S     0.7500    1.0000    0.8571         3
     PPER3FP     1.0000    1.0000    1.0000         9
     PPER3FS     1.0000    1.0000    1.0000        96
     PPER3MP     1.0000    1.0000    1.0000        31
     PPER3MS     1.0000    1.0000    1.0000       377
     PPOBJFP     1.0000    0.7500    0.8571         4
     PPOBJFS     0.9167    0.8919    0.9041        37
     PPOBJMP     0.7500    0.7500    0.7500        12
     PPOBJMS     0.9371    0.9640    0.9504       139
        PREF     1.0000    1.0000    1.0000       332
       PREFP     1.0000    1.0000    1.0000        64
       PREFS     1.0000    1.0000    1.0000        13
        PREL     0.9964    0.9964    0.9964       277
      PRELFP     1.0000    1.0000    1.0000         5
      PRELFS     0.8000    1.0000    0.8889         4
      PRELMP     1.0000    1.0000    1.0000         3
      PRELMS     1.0000    1.0000    1.0000        11
        PREP     0.9971    0.9977    0.9974      6161
        PRON     0.9836    0.9836    0.9836        61
       PROPN     0.9468    0.9503    0.9486      4310
       PUNCT     1.0000    1.0000    1.0000      4019
         SYM     0.9394    0.8158    0.8732        76
        VERB     0.9956    0.9921    0.9938      2273
       VPPFP     0.9145    0.9469    0.9304       113
       VPPFS     0.9562    0.9597    0.9580       273
       VPPMP     0.8827    0.9728    0.9256       147
       VPPMS     0.9778    0.9794    0.9786       630
       VPPRE     0.0000    0.0000    0.0000         1
           X     0.9604    0.9935    0.9766      1073
      XFAMIL     0.9386    0.9113    0.9248      1342
       YPFOR     1.0000    1.0000    1.0000      2750

    accuracy                         0.9778     47574
   macro avg     0.9151    0.9285    0.9202     47574
weighted avg     0.9785    0.9778    0.9780     47574

DatasetDict({
    train: Dataset({
        features: ['id', 'tokens', 'pos_tags'],
        num_rows: 14453
    })
    validation: Dataset({
        features: ['id', 'tokens', 'pos_tags'],
        num_rows: 1477
    })
    test: Dataset({
        features: ['id', 'tokens', 'pos_tags'],
        num_rows: 417
    })
})
optimizer.pt ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:6d4765236b34abe2f3ff11d7ad1724fd097175f43d78574642c1b1329128f8b2
size 880766181
predict.py ADDED
from transformers import CamembertTokenizer, CamembertForTokenClassification, TokenClassificationPipeline

OUTPUT_PATH = './'

# Load the fine-tuned model and its tokenizer
tokenizer = CamembertTokenizer.from_pretrained(OUTPUT_PATH)
model = CamembertForTokenClassification.from_pretrained(OUTPUT_PATH)

pos = TokenClassificationPipeline(model=model, tokenizer=tokenizer)

def make_prediction(sentence):
    # Pair each whitespace-separated word with its predicted tag
    labels = [l['entity'] for l in pos(sentence)]
    return list(zip(sentence.split(" "), labels))

res = make_prediction("George Washington est allé à Washington")
preview.PNG ADDED
pytorch_model.bin ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:995aba8183b98bcbeeeb548ea48ea9daea4f9143c62c0097738df40bdc21aa54
size 440410097
rng_state.pth ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:4bf1777801e2479389bdfb66a8aa6f361d4127c6a9b4249841100ffe937982fb
size 16543
scheduler.pt ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:9917c381e092c6b1349bc30c0b89ccbac8dda8de23bce2c62425520bd40ac616
size 623
sentencepiece.bpe.model ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:988bc5a00281c6d210a5d34bd143d0363741a432fefe741bf71e61b1869d4314
size 810912
special_tokens_map.json ADDED
{"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": false}, "additional_special_tokens": ["<s>NOTUSED", "</s>NOTUSED"]}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
{"bos_token": "<s>", "eos_token": "</s>", "sep_token": "</s>", "cls_token": "<s>", "unk_token": "<unk>", "pad_token": "<pad>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "additional_special_tokens": ["<s>NOTUSED", "</s>NOTUSED"], "model_max_length": 512, "special_tokens_map_file": null, "name_or_path": "camembert-base", "tokenizer_class": "CamembertTokenizer"}
trainer_state.json ADDED
{
  "best_metric": null,
  "best_model_checkpoint": null,
  "epoch": 19.90049751243781,
  "global_step": 12000,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {"epoch": 0.83, "learning_rate": 4.792703150912106e-05, "loss": 1.636, "step": 500},
    {"epoch": 1.0, "eval_accuracy": 0.9660949257998066, "eval_f1": 0.9648470438617105, "eval_loss": 0.40596073865890503, "eval_precision": 0.9632890778105937, "eval_recall": 0.9664100575985822, "eval_runtime": 12.913, "eval_samples_per_second": 114.381, "eval_steps_per_second": 4.801, "step": 603},
    {"epoch": 1.66, "learning_rate": 4.5854063018242126e-05, "loss": 0.3601, "step": 1000},
    {"epoch": 2.0, "eval_accuracy": 0.9747971581115735, "eval_f1": 0.9757520460075205, "eval_loss": 0.16257813572883606, "eval_precision": 0.9742435954063604, "eval_recall": 0.9772651750110767, "eval_runtime": 13.0776, "eval_samples_per_second": 112.941, "eval_steps_per_second": 4.741, "step": 1206},
    {"epoch": 2.49, "learning_rate": 4.3781094527363184e-05, "loss": 0.1654, "step": 1500},
    {"epoch": 3.0, "eval_accuracy": 0.9755118341951486, "eval_f1": 0.9770373954517177, "eval_loss": 0.11993579566478729, "eval_precision": 0.9755404025066946, "eval_recall": 0.9785389898094816, "eval_runtime": 12.7573, "eval_samples_per_second": 115.777, "eval_steps_per_second": 4.86, "step": 1809},
    {"epoch": 3.32, "learning_rate": 4.170812603648425e-05, "loss": 0.1051, "step": 2000},
    {"epoch": 4.0, "eval_accuracy": 0.9774877033673856, "eval_f1": 0.9789900275245854, "eval_loss": 0.10314257442951202, "eval_precision": 0.9779755160693067, "eval_recall": 0.9800066459902526, "eval_runtime": 12.7609, "eval_samples_per_second": 115.744, "eval_steps_per_second": 4.859, "step": 2412},
    {"epoch": 4.15, "learning_rate": 3.9635157545605314e-05, "loss": 0.072, "step": 2500},
    {"epoch": 4.98, "learning_rate": 3.756218905472637e-05, "loss": 0.0557, "step": 3000},
    {"epoch": 5.0, "eval_accuracy": 0.9762895699331567, "eval_f1": 0.979079382198808, "eval_loss": 0.10313071310520172, "eval_precision": 0.9777679582424259, "eval_recall": 0.9803943287549844, "eval_runtime": 12.8291, "eval_samples_per_second": 115.129, "eval_steps_per_second": 4.833, "step": 3015},
    {"epoch": 5.8, "learning_rate": 3.548922056384743e-05, "loss": 0.0406, "step": 3500},
    {"epoch": 6.0, "eval_accuracy": 0.977109345440787, "eval_f1": 0.9801621337465068, "eval_loss": 0.10502293705940247, "eval_precision": 0.9793221650909493, "eval_recall": 0.9810035445281347, "eval_runtime": 12.7666, "eval_samples_per_second": 115.693, "eval_steps_per_second": 4.856, "step": 3618},
    {"epoch": 6.63, "learning_rate": 3.341625207296849e-05, "loss": 0.0332, "step": 4000},
    {"epoch": 7.0, "eval_accuracy": 0.9776558624458738, "eval_f1": 0.9800943409276397, "eval_loss": 0.10666096210479736, "eval_precision": 0.9791868210840543, "eval_recall": 0.9810035445281347, "eval_runtime": 12.7474, "eval_samples_per_second": 115.867, "eval_steps_per_second": 4.864, "step": 4221},
    {"epoch": 7.46, "learning_rate": 3.1343283582089554e-05, "loss": 0.0248, "step": 4500},
    {"epoch": 8.0, "eval_accuracy": 0.9782654391053937, "eval_f1": 0.9810459324847814, "eval_loss": 0.10935021936893463, "eval_precision": 0.9802864410528644, "eval_recall": 0.9818066016836509, "eval_runtime": 12.9986, "eval_samples_per_second": 113.628, "eval_steps_per_second": 4.77, "step": 4824},
    {"epoch": 8.29, "learning_rate": 2.9270315091210616e-05, "loss": 0.0209, "step": 5000},
    {"epoch": 9.0, "eval_accuracy": 0.9786648169168033, "eval_f1": 0.9818035894668382, "eval_loss": 0.1135268285870552, "eval_precision": 0.9812197483059051, "eval_recall": 0.9823881258307487, "eval_runtime": 12.5526, "eval_samples_per_second": 117.665, "eval_steps_per_second": 4.939, "step": 5427},
    {"epoch": 9.12, "learning_rate": 2.7197346600331674e-05, "loss": 0.0177, "step": 5500},
    {"epoch": 9.95, "learning_rate": 2.512437810945274e-05, "loss": 0.0145, "step": 6000},
    {"epoch": 10.0, "eval_accuracy": 0.9779291209484172, "eval_f1": 0.9809597608900205, "eval_loss": 0.12104799598455429, "eval_precision": 0.980362871999115, "eval_recall": 0.9815573770491803, "eval_runtime": 12.8411, "eval_samples_per_second": 115.021, "eval_steps_per_second": 4.828, "step": 6030},
    {"epoch": 10.78, "learning_rate": 2.3051409618573798e-05, "loss": 0.0112, "step": 6500},
    {"epoch": 11.0, "eval_accuracy": 0.9774246437129525, "eval_f1": 0.9806444472114999, "eval_loss": 0.12379806488752365, "eval_precision": 0.9798988027760113, "eval_recall": 0.9813912272928667, "eval_runtime": 12.8231, "eval_samples_per_second": 115.183, "eval_steps_per_second": 4.835, "step": 6633},
    {"epoch": 11.61, "learning_rate": 2.097844112769486e-05, "loss": 0.0105, "step": 7000},
    {"epoch": 12.0, "eval_accuracy": 0.9780552402572834, "eval_f1": 0.9810323598179328, "eval_loss": 0.1279357671737671, "eval_precision": 0.9802593381072189, "eval_recall": 0.9818066016836509, "eval_runtime": 12.5624, "eval_samples_per_second": 117.573, "eval_steps_per_second": 4.935, "step": 7236},
    {"epoch": 12.44, "learning_rate": 1.890547263681592e-05, "loss": 0.0088, "step": 7500},
    {"epoch": 13.0, "eval_accuracy": 0.9773405641737083, "eval_f1": 0.9802169221404461, "eval_loss": 0.1307593435049057, "eval_precision": 0.9794039588632091, "eval_recall": 0.981031236154187, "eval_runtime": 12.6149, "eval_samples_per_second": 117.084, "eval_steps_per_second": 4.915, "step": 7839},
    {"epoch": 13.27, "learning_rate": 1.6832504145936983e-05, "loss": 0.0078, "step": 8000},
    {"epoch": 14.0, "eval_accuracy": 0.9781182999117165, "eval_f1": 0.980741027698608, "eval_loss": 0.13244280219078064, "eval_precision": 0.9800088480893657, "eval_recall": 0.9814743021710235, "eval_runtime": 12.7732, "eval_samples_per_second": 115.633, "eval_steps_per_second": 4.854, "step": 8442},
    {"epoch": 14.1, "learning_rate": 1.4759535655058043e-05, "loss": 0.0063, "step": 8500},
    {"epoch": 14.93, "learning_rate": 1.2686567164179105e-05, "loss": 0.0056, "step": 9000},
    {"epoch": 15.0, "eval_accuracy": 0.9781393197965275, "eval_f1": 0.9811566131710017, "eval_loss": 0.13830867409706116, "eval_precision": 0.9803970360539703, "eval_recall": 0.98191736818786, "eval_runtime": 12.5885, "eval_samples_per_second": 117.329, "eval_steps_per_second": 4.925, "step": 9045},
    {"epoch": 15.75, "learning_rate": 1.0613598673300167e-05, "loss": 0.0046, "step": 9500},
    {"epoch": 16.0, "eval_accuracy": 0.9780972800269054, "eval_f1": 0.9813613029099613, "eval_loss": 0.1351945400238037, "eval_precision": 0.9807506153718505, "eval_recall": 0.9819727514399645, "eval_runtime": 12.6382, "eval_samples_per_second": 116.868, "eval_steps_per_second": 4.906, "step": 9648},
    {"epoch": 16.58, "learning_rate": 8.540630182421228e-06, "loss": 0.0046, "step": 10000},
    {"epoch": 17.0, "eval_accuracy": 0.977845041409173, "eval_f1": 0.9810062666869561, "eval_loss": 0.14340569078922272, "eval_precision": 0.9801520387007602, "eval_recall": 0.9818619849357554, "eval_runtime": 12.6576, "eval_samples_per_second": 116.688, "eval_steps_per_second": 4.898, "step": 10251},
    {"epoch": 17.41, "learning_rate": 6.467661691542288e-06, "loss": 0.0033, "step": 10500},
    {"epoch": 18.0, "eval_accuracy": 0.9776979022154959, "eval_f1": 0.9810448835021307, "eval_loss": 0.1425262689590454, "eval_precision": 0.9803395642074991, "eval_recall": 0.9817512184315463, "eval_runtime": 12.8452, "eval_samples_per_second": 114.985, "eval_steps_per_second": 4.827, "step": 10854},
    {"epoch": 18.24, "learning_rate": 4.39469320066335e-06, "loss": 0.0032, "step": 11000},
    {"epoch": 19.0, "eval_accuracy": 0.9780972800269054, "eval_f1": 0.9813638816253684, "eval_loss": 0.14492034912109375, "eval_precision": 0.9806176901595377, "eval_recall": 0.982111209570226, "eval_runtime": 12.4458, "eval_samples_per_second": 118.675, "eval_steps_per_second": 4.982, "step": 11457},
    {"epoch": 19.07, "learning_rate": 2.3217247097844113e-06, "loss": 0.003, "step": 11500},
    {"epoch": 19.9, "learning_rate": 2.4875621890547267e-07, "loss": 0.0029, "step": 12000}
  ],
  "max_steps": 12060,
  "num_train_epochs": 20,
  "total_flos": 1.21984229568579e+16,
  "trial_name": null,
  "trial_params": null
}
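The logged learning rates are consistent with a linear decay to zero over `max_steps`, starting from an assumed initial rate of 5e-5 (inferred from the logged values, not stated explicitly in the file). A sketch:

```python
BASE_LR = 5e-5      # assumed initial rate, inferred from the logged values
MAX_STEPS = 12060   # "max_steps" from the trainer state above

def linear_lr(step):
    # Linear decay to zero over the full training run
    return BASE_LR * (1 - step / MAX_STEPS)

print(linear_lr(500))    # matches the first logged value, ~4.7927e-05
print(linear_lr(12000))  # matches the last logged value, ~2.4876e-07
```
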
training_args.bin ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:6f69d3406e6a240d9dca8d53ded465b0836fb3e46b9706b94b6e58db548f7051
size 2863