This model was built to detect the lexical field of the body, physical sensations, and sensory perception in French text. Its main purpose was to automate annotation of a specific dataset; there is no warranty that it will work on any other dataset. We fine-tuned the camembert-base model using this code: https://github.com/psycholinguistics2125/train_NER.

| Feature | Description |
| --- | --- |
| Name | fr_sensations_and_body |
| Version | 0.0.1 |
| spaCy | >=3.4.4,<3.5.0 |
| Default Pipeline | transformer, ner |
| Components | transformer, ner |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | n/a |
| License | n/a |
| Author | n/a |

Label Scheme

4 labels for 1 component:

| Component | Labels |
| --- | --- |
| ner | CORPS, MOTS_PERCEPTIONS_SENSORIELLES, SENSATIONS_PHYSIQUES, VERB_PERCEPTIONS_SENSORIELLES |
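
As an illustration of how the four labels might be consumed downstream, here is a minimal sketch. It assumes the `fr_sensations_and_body` package (and its transformer dependencies) is already installed in the local environment; `load_pipeline` and `group_entities` are hypothetical helper names, not part of the released model.

```python
from collections import defaultdict

# Labels listed in the label scheme of this card.
LABELS = (
    "CORPS",
    "MOTS_PERCEPTIONS_SENSORIELLES",
    "SENSATIONS_PHYSIQUES",
    "VERB_PERCEPTIONS_SENSORIELLES",
)


def load_pipeline(name="fr_sensations_and_body"):
    """Load the pipeline by package name (assumes it is installed locally)."""
    import spacy  # imported lazily so group_entities works without spaCy

    return spacy.load(name)


def group_entities(doc):
    """Group entity texts by label for any object exposing spaCy-style `.ents`."""
    grouped = defaultdict(list)
    for ent in doc.ents:
        # Keep only the labels documented in the card.
        if ent.label_ in LABELS:
            grouped[ent.label_].append(ent.text)
    return dict(grouped)
```

With the package installed, `group_entities(load_pipeline()(text))` returns a dictionary mapping each detected label to the list of matching spans in `text`.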

Accuracy

| Type | Score |
| --- | --- |
| ENTS_F | 85.46 |
| ENTS_P | 85.37 |
| ENTS_R | 85.56 |
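
The three scores are related: the entity-level F-score is the harmonic mean of precision and recall. The short check below recomputes it from the reported ENTS_P and ENTS_R; the function name `f_score` is ours, and the inputs are the rounded values from the table, so the last digit could differ in general.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


ents_p, ents_r = 85.37, 85.56  # values reported in the table above
ents_f = f_score(ents_p, ents_r)
print(round(ents_f, 2))  # 85.46
```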

Training

We built our dataset by manually labeling the documents with Doccano, an open-source tool for collaborative human annotation. The models were trained on sequences of 200 words. 70% of the data was used for training, 20% for testing and tuning hyperparameters, and 10% for evaluating the performance of the model. To keep the performance evaluation sound, the evaluation sequences were taken from documents that were not used during training.
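
The procedure described above can be sketched as follows. This is not the code from the training repository: it is a minimal illustration that assumes the split is made at the document level (so that no document contributes sequences to more than one subset) and that sequences are cut on whitespace. The names `chunk_words` and `split_documents`, and the `train`/`test`/`valid` keys for the 70/20/10 subsets, are assumptions.

```python
import random


def chunk_words(text, size=200):
    """Cut a document into consecutive sequences of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def split_documents(doc_ids, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle document ids and split them into train/test/valid subsets.

    Splitting by document, rather than by sequence, keeps every evaluation
    sequence in a document the model never saw during training.
    """
    ids = sorted(doc_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * fractions[0])
    n_test = int(len(ids) * fractions[1])
    return {
        "train": ids[:n_train],
        "test": ids[n_train:n_train + n_test],
        "valid": ids[n_train + n_test:],  # remainder, roughly fractions[2]
    }
```

Each document is first assigned to a subset with `split_documents`, then cut into sequences with `chunk_words`, so sequences inherit the subset of their source document.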

| label | train | test | valid |
| --- | --- | --- | --- |
| CORPS | 523 | 152 | 106 |
| MOTS_PERCEPTIONS_SENSORIELLES | 250 | 108 | 82 |
| SENSATIONS_PHYSIQUES | 91 | 38 | 31 |
| VERB_PERCEPTIONS_SENSORIELLES | 617 | 162 | 137 |
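
To make the label distribution easier to read, the counts above can be summed per subset and turned into shares. The snippet only re-uses the numbers from the table; `split_totals` and `label_share` are illustrative helper names.

```python
# Entity counts per label and subset, copied from the table above.
COUNTS = {
    "CORPS": {"train": 523, "test": 152, "valid": 106},
    "MOTS_PERCEPTIONS_SENSORIELLES": {"train": 250, "test": 108, "valid": 82},
    "SENSATIONS_PHYSIQUES": {"train": 91, "test": 38, "valid": 31},
    "VERB_PERCEPTIONS_SENSORIELLES": {"train": 617, "test": 162, "valid": 137},
}


def split_totals(counts):
    """Total number of annotated entities in each subset."""
    totals = {}
    for per_split in counts.values():
        for split, n in per_split.items():
            totals[split] = totals.get(split, 0) + n
    return totals


def label_share(counts, split):
    """Fraction of a subset's entities that carry each label."""
    total = sum(per_split[split] for per_split in counts.values())
    return {label: per_split[split] / total for label, per_split in counts.items()}


print(split_totals(COUNTS))  # {'train': 1481, 'test': 460, 'valid': 356}
```

In the training subset, `label_share(COUNTS, "train")` shows that SENSATIONS_PHYSIQUES accounts for about 6% of the entities, which makes it by far the rarest label.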