Update README.md
README.md
CHANGED
@@ -20,7 +20,21 @@ model-index:
|
|
20 |
- name: NER F Score
|
21 |
type: f_score
|
22 |
value: 0.8546448087
|
23 |
---
|
24 |
| Feature | Description |
|
25 |
| --- | --- |
|
26 |
| **Name** | `fr_sensations_and_body` |
|
@@ -52,5 +66,17 @@ model-index:
|
|
52 |
| `ENTS_F` | 85.46 |
|
53 |
| `ENTS_P` | 85.37 |
|
54 |
| `ENTS_R` | 85.56 |
|
55 |
-
|
56 |
-
|
20 |
- name: NER F Score
|
21 |
type: f_score
|
22 |
value: 0.8546448087
|
23 |
+
|
24 |
+
widget:
|
25 |
+
- text: "Il y avait du sang partout, les bras et les jambes n'étaient plus aux bons endroits."
|
26 |
+
example_title: "corps"
|
27 |
+
- text: "J'étais un peu dans le coton."
|
28 |
+
example_title: "Sensations physiques"
|
29 |
+
- text: "Il y avait comme un silence assourdissant. Et là j'ai vu la beauté du lever de soleil."
|
30 |
+
example_title: "Perceptions"
|
31 |
---
|
32 |
+
|
33 |
+
This model was built to detect the lexical fields of the body, physical sensations, and perception.
|
34 |
+
Its main purpose was to automate annotation on a specific dataset.
|
35 |
+
There is no warranty that it will work on any other dataset.
|
36 |
+
We fine-tuned the camembert-base model using this code: https://github.com/psycholinguistics2125/train_NER.
|
37 |
+
|
38 |
| Feature | Description |
|
39 |
| --- | --- |
|
40 |
| **Name** | `fr_sensations_and_body` |
|
|
|
66 |
| `ENTS_F` | 85.46 |
|
67 |
| `ENTS_P` | 85.37 |
|
68 |
| `ENTS_R` | 85.56 |
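
As a quick consistency check on the scores above, the entity F-score is the harmonic mean of precision and recall. The snippet below is a minimal sketch that recomputes it from the rounded `ENTS_P` and `ENTS_R` values; the helper name `f_score` is illustrative and is not part of the training code.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values from the table, in percent.
ents_p, ents_r = 85.37, 85.56
print(round(f_score(ents_p, ents_r), 2))  # prints 85.46
```

The result matches the reported `ENTS_F` to two decimal places; small differences further out are expected because the inputs are already rounded.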
|
69 |
+
|
70 |
+
### Training
|
71 |
+
|
72 |
+
We constructed our dataset by manually labeling the documents using Doccano, an open-source tool for collaborative human annotation.
|
73 |
+
The models were trained on 200-word sequences: 70% of the data was used for training, 20% for testing and fine-tuning hyperparameters, and 10% for evaluating model performance.
|
74 |
+
To ensure a sound performance evaluation, the evaluation sequences were taken from documents that were not used during training.
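
The splitting procedure described above can be sketched as follows. This is a hedged illustration, not the code used for this model: the function names `chunk_words` and `split_by_document` are hypothetical, whitespace tokenization is an assumption, and the sketch assigns whole documents to all three splits, so the 70/20/10 ratios apply to documents and only approximately to sequences.

```python
import random


def chunk_words(text, size=200):
    """Split a text into consecutive sequences of at most `size` words."""
    words = text.split()  # assumption: words are whitespace-separated
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def split_by_document(docs, ratios=(0.7, 0.2, 0.1), seed=0):
    """Assign whole documents to train/test/valid, then chunk each document.

    `docs` maps a document id to its raw text. Because a document belongs
    to exactly one split, no evaluation sequence comes from a document
    seen during training.
    """
    doc_ids = sorted(docs)
    random.Random(seed).shuffle(doc_ids)  # reproducible shuffle
    n_train = round(ratios[0] * len(doc_ids))
    n_test = round(ratios[1] * len(doc_ids))
    groups = {
        "train": doc_ids[:n_train],
        "test": doc_ids[n_train:n_train + n_test],
        "valid": doc_ids[n_train + n_test:],  # remainder, used for evaluation
    }
    return {
        name: [(doc_id, seq) for doc_id in ids for seq in chunk_words(docs[doc_id])]
        for name, ids in groups.items()
    }
```

For example, calling `split_by_document({"a": text_a, "b": text_b, ...})` returns a dictionary with `train`, `test`, and `valid` lists of `(doc_id, sequence)` pairs. Splitting at the document level rather than the sequence level is what keeps neighbouring passages of the same document from leaking across splits.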
|
75 |
+
|
76 |
+
|
77 |
+
| label | train | test | valid |
|
78 |
+
| --- | --- | --- | --- |
|
79 |
+
| `CORPS` | 523 | 152 | 106 |
|
80 |
+
| `MOTS_PERCEPTIONS_SENSORIELLES` | 250 | 108 | 82 |
|
81 |
+
| `SENSATIONS_PHYSIQUES` | 91 | 38 | 31 |
|
82 |
+
| `VERB_PERCEPTIONS_SENSORIELLES` | 617 | 162 | 137 |
|