binbin83 committed on
Commit
e04e615
1 Parent(s): bec823f

Update README.md

  1. README.md +28 -2
README.md CHANGED
@@ -20,7 +20,21 @@ model-index:
20
  - name: NER F Score
21
  type: f_score
22
  value: 0.8546448087
23
  ---
24
  | Feature | Description |
25
  | --- | --- |
26
  | **Name** | `fr_sensations_and_body` |
@@ -52,5 +66,17 @@ model-index:
52
  | `ENTS_F` | 85.46 |
53
  | `ENTS_P` | 85.37 |
54
  | `ENTS_R` | 85.56 |
55
- | `TRANSFORMER_LOSS` | 162304.44 |
56
- | `NER_LOSS` | 50389.76 |
20
  - name: NER F Score
21
  type: f_score
22
  value: 0.8546448087
23
+
24
+ widget:
25
+ - text: "Il y avait du sang partout, les bras et les jambes n'étaient plus aux bons endroits."
26
+   example_title: "corps"
27
+ - text: "J'étais un peu dans le coton."
28
+   example_title: "Sensations physiques"
29
+ - text: "Il y avait comme un silence assourdissant. Et là j'ai vu la beauté du lever de soleil."
30
+   example_title: "Perceptions"
31
  ---
32
+
33
+ This model was built to detect the lexical fields of the body, physical sensations, and perception.
34
+ Its main purpose was to automate annotation on a specific dataset.
35
+ There is no warranty that it will work on any other dataset.
36
+ We fine-tuned the camembert-base model using this code: https://github.com/psycholinguistics2125/train_NER.
37
+
38
  | Feature | Description |
39
  | --- | --- |
40
  | **Name** | `fr_sensations_and_body` |
 
66
  | `ENTS_F` | 85.46 |
67
  | `ENTS_P` | 85.37 |
68
  | `ENTS_R` | 85.56 |
69
+
70
+ ### Training
71
+
72
+ We constructed our dataset by manually labeling the documents using Doccano, an open-source tool for collaborative human annotation.
73
+ The models were trained on 200-word sequences: 70% of the data was used for training, 20% for testing and tuning hyperparameters, and 10% for evaluating the model's performance.
74
+ To ensure a fair performance evaluation, the evaluation sequences were taken from documents that were not used during training.
75
+
76
+
77
+ | label | train | test | valid |
78
+ | --- | --- | --- | --- |
79
+ | `CORPS` | 523 | 152 | 106 |
80
+ | `MOTS_PERCEPTIONS_SENSORIELLES` | 250 | 108 | 82 |
81
+ | `SENSATIONS_PHYSIQUES` | 91 | 38 | 31 |
82
+ | `VERB_PERCEPTIONS_SENSORIELLES` | 617 | 162 | 137 |
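
The Training section added in this commit describes cutting documents into 200-word sequences and keeping evaluation sequences in documents that were never seen during training. A minimal sketch of one way to do that is shown below. It is not the code from the linked train_NER repository: the helper names `chunk_words` and `split_by_document` are hypothetical, the split names mirror the table columns (`train`, `test`, `valid`), and it assumes that all three splits are disjoint by document and that the 70/20/10 proportions are applied to documents rather than to sequences.

```python
import random


def chunk_words(text, size=200):
    """Cut a document into consecutive sequences of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def split_by_document(docs, fractions=(0.7, 0.2, 0.1), seed=0):
    """Assign whole documents to train/test/valid, then chunk each one.

    docs: dict mapping a document id to its raw text (hypothetical layout).
    Returns a dict mapping a split name to a list of (doc_id, sequence).
    Splitting at the document level (an assumption here) guarantees that
    no document contributes sequences to more than one split.
    """
    doc_ids = sorted(docs)                   # deterministic starting order
    random.Random(seed).shuffle(doc_ids)     # reproducible shuffle
    n_train = round(fractions[0] * len(doc_ids))
    n_test = round(fractions[1] * len(doc_ids))
    assignment = {
        "train": doc_ids[:n_train],
        "test": doc_ids[n_train:n_train + n_test],
        "valid": doc_ids[n_train + n_test:],  # remaining documents
    }
    return {
        name: [(doc_id, seq) for doc_id in ids for seq in chunk_words(docs[doc_id])]
        for name, ids in assignment.items()
    }
```

Because the proportions apply to documents, the share of sequences in each split is only approximately 70/20/10 when documents differ in length.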