Update README.md
README.md
CHANGED
@@ -20,7 +20,19 @@ model-index:
|
|
20 |
- name: NER F Score
|
21 |
type: f_score
|
22 |
value: 0.776119403
|
23 |
---
|
24 |
| Feature | Description |
|
25 |
| --- | --- |
|
26 |
| **Name** | `fr_lexical_death` |
|
@@ -53,4 +65,18 @@ model-index:
|
|
53 |
| `ENTS_P` | 82.54 |
|
54 |
| `ENTS_R` | 73.24 |
|
55 |
| `TRANSFORMER_LOSS` | 51778.17 |
|
56 |
-
| `NER_LOSS` | 41163.78 |
|
20 |
- name: NER F Score
|
21 |
type: f_score
|
22 |
value: 0.776119403
|
23 |
+
license: agpl-3.0
|
24 |
+
widget:
|
25 |
+
- example 1: "Il faut pas sortir, vous reviendrez pas vivantes."
|
26 |
+
- example 2: "Les morts ne parlents pas."
|
27 |
+
- example 3: "Les Ambulances garés, les cortèges de defunts, les cadavres qu'on sortait des décombres"
|
28 |
---
|
29 |
+
|
30 |
+
## Description
|
31 |
+
|
32 |
+
This model was built to detect the lexical field of death. Its main purpose was to automate annotation on a specific dataset.
|
33 |
+
There is no warranty that it will work on any other dataset. We fine-tuned the camembert-base model using this code: https://github.com/psycholinguistics2125/train_NER.
|
34 |
+
|
35 |
+
|
36 |
| Feature | Description |
|
37 |
| --- | --- |
|
38 |
| **Name** | `fr_lexical_death` |
|
|
|
65 |
| `ENTS_P` | 82.54 |
|
66 |
| `ENTS_R` | 73.24 |
|
67 |
| `TRANSFORMER_LOSS` | 51778.17 |
|
68 |
+
| `NER_LOSS` | 41163.78 |
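As a consistency check, the F score in the metadata (`0.776119403`) can be recovered from the `ENTS_P` and `ENTS_R` rows of the table. The sketch below assumes the standard F1 definition (harmonic mean of precision and recall); the helper name `f1_score` is illustrative and is not taken from the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the metrics table, expressed in percent.
ents_p = 82.54
ents_r = 73.24

ents_f = f1_score(ents_p, ents_r)
print(round(ents_f, 2))  # about 77.61, in line with the reported 0.776
```

The small difference in the last digits comes from using the rounded table values rather than the raw counts.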
|
69 |
+
|
70 |
+
### Training
|
71 |
+
|
72 |
+
We constructed our dataset by manually labeling the documents using Doccano, an open-source tool for collaborative human annotation.
|
73 |
+
The models were trained on 200-word sequences. 70% of the data was used for training, 20% to test and fine-tune hyperparameters,
|
74 |
+
and 10% to evaluate the performance of the model. In order to ensure a correct performance evaluation,
|
75 |
+
the evaluation sequences were taken from documents that were not used during the training.
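The splitting scheme described above can be sketched as follows. This is a minimal illustration, not the code from the linked repository: the function names `chunk_words` and `split_by_document`, the whitespace tokenization, and the choice to assign whole documents to every split are all assumptions made for the example.

```python
import random


def chunk_words(text: str, size: int = 200) -> list[str]:
    """Cut a text into consecutive sequences of at most `size` words."""
    words = text.split()  # assumption: simple whitespace tokenization
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def split_by_document(docs: dict[str, str], ratios=(0.7, 0.2, 0.1), seed: int = 0):
    """Assign whole documents to train/test/valid, then chunk each document.

    Splitting at the document level guarantees that no evaluation sequence
    comes from a document seen during training.
    """
    doc_ids = sorted(docs)
    random.Random(seed).shuffle(doc_ids)  # reproducible shuffle
    n_train = round(len(doc_ids) * ratios[0])
    n_test = round(len(doc_ids) * ratios[1])
    assignment = {
        "train": doc_ids[:n_train],
        "test": doc_ids[n_train:n_train + n_test],
        "valid": doc_ids[n_train + n_test:],
    }
    return {
        name: [(doc_id, chunk) for doc_id in ids for chunk in chunk_words(docs[doc_id])]
        for name, ids in assignment.items()
    }


# Toy corpus: 10 hypothetical documents of 450 words each.
docs = {f"doc{i}": " ".join(["mot"] * 450) for i in range(10)}
splits = split_by_document(docs)
print({name: len(items) for name, items in splits.items()})
```

Here the `test` split plays the role of the 20% used for hyperparameter tuning and `valid` the 10% held out for the final evaluation, following the naming used in the label counts.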
|
76 |
+
|
77 |
+
Train dataset is 147 labels for MORT_EXPLICITE
|
78 |
+
|
79 |
+
Test dataset is 35 labels for MORT_EXPLICITE
|
80 |
+
|
81 |
+
Valid dataset is 18 labels for MORT_EXPLICITE
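For reference, the share of `MORT_EXPLICITE` labels in each split can be computed from the three counts listed above. This is plain arithmetic on the reported numbers; the dictionary keys are illustrative names.

```python
# Label counts as reported for each split.
label_counts = {"train": 147, "test": 35, "valid": 18}

total = sum(label_counts.values())
shares = {name: 100 * count / total for name, count in label_counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
# train: 73.5%
# test: 17.5%
# valid: 9.0%
```

The label shares are close to, but not exactly, the 70/20/10 split of the data, which is expected when the split is made on sequences or documents rather than on labels.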
|
82 |
+
|