binbin83/fr_lexical_death

Token Classification

binbin83 committed on Oct 5, 2023

Commit 7db14b6 • 1 Parent(s): 4b7ceac

Update README.md

Files changed (1)

README.md +27 -1

README.md CHANGED

@@ -20,7 +20,19 @@ model-index:
     - name: NER F Score
       type: f_score
       value: 0.776119403
 ---
 | Feature | Description |
 | --- | --- |
 | **Name** | `fr_lexical_death` |
@@ -53,4 +65,18 @@ model-index:
 | `ENTS_P` | 82.54 |
 | `ENTS_R` | 73.24 |
 | `TRANSFORMER_LOSS` | 51778.17 |
-| `NER_LOSS` | 41163.78 |

     - name: NER F Score
       type: f_score
       value: 0.776119403
+license: agpl-3.0
+widget:
+- example 1: "Il faut pas sortir, vous reviendrez pas vivantes."
+- example 2: "Les morts ne parlent pas."
+- example 3: "Les ambulances garées, les cortèges de défunts, les cadavres qu'on sortait des décombres"
 ---
+## Description
+This model was built to detect the lexical field of death. Its main purpose was to automate annotation on a specific dataset.
+There is no warranty that it will work on any other dataset. We fine-tuned the camembert-base model using this code: https://github.com/psycholinguistics2125/train_NER.
 | Feature | Description |
 | --- | --- |
 | **Name** | `fr_lexical_death` |
 | `ENTS_P` | 82.54 |
 | `ENTS_R` | 73.24 |
 | `TRANSFORMER_LOSS` | 51778.17 |
+| `NER_LOSS` | 41163.78 |
+### Training
+We constructed our dataset by manually labeling the documents with Doccano, an open-source tool for collaborative human annotation.
+The models were trained on 200-word sequences: 70% of the data was used for training, 20% for testing and tuning hyperparameters,
+and 10% for evaluating the performance of the model. To ensure a sound performance evaluation,
+the evaluation sequences were taken from documents that were not used during training.
+- Train dataset: 147 labels for MORT_EXPLICITE
+- Test dataset: 35 labels for MORT_EXPLICITE
+- Valid dataset: 18 labels for MORT_EXPLICITE
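
The Training notes above describe cutting documents into 200-word sequences and keeping the evaluation sequences in documents that were never seen during training. The following is a minimal sketch of one way to get that property, assuming the split is made at the document level before chunking. The helper names `chunk_words` and `split_by_document`, the split names, and the fixed seed are hypothetical and are not taken from the linked training repository, which may do this differently.

```python
import random


def chunk_words(text, size=200):
    """Cut a text into consecutive sequences of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def split_by_document(docs, ratios=(0.7, 0.2, 0.1), seed=0):
    """Assign whole documents to train/test/eval, then chunk each one.

    `docs` maps a document id to its raw text. Because every document
    lands in exactly one split, no evaluation sequence can come from a
    document that was used for training.
    """
    ids = sorted(docs)
    random.Random(seed).shuffle(ids)  # deterministic shuffle (assumed seed)
    n_train = round(ratios[0] * len(ids))
    n_test = round(ratios[1] * len(ids))
    groups = {
        "train": ids[:n_train],
        "test": ids[n_train:n_train + n_test],
        "eval": ids[n_train + n_test:],
    }
    return {
        name: [(doc_id, seq) for doc_id in members for seq in chunk_words(docs[doc_id])]
        for name, members in groups.items()
    }


if __name__ == "__main__":
    # Toy corpus: 10 synthetic documents of 450 words each.
    corpus = {f"doc{i}": " ".join(f"w{j}" for j in range(450)) for i in range(10)}
    for name, sequences in split_by_document(corpus).items():
        print(name, len(sequences))
```

Splitting by document makes the 70/20/10 proportions exact only in number of documents; the proportions in number of sequences or labels will drift when documents differ in length.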