jeanpoll committed on
Commit
6de7397
2 Parent(s): d37bc50 bcb19a1

Merge branch 'main' of https://huggingface.co/Jean-Baptiste/camembert-ner into main

Updating local with changes done in readme.md

Files changed (1)
  1. README.md +104 -0
README.md ADDED
@@ -0,0 +1,104 @@
+ ---
+ language: fr
+ datasets:
+ - Jean-Baptiste/wikiner_fr
+ widget:
+ - text: "Je m'appelle Jean-Baptiste et je vis à Paris"
+ ---
+
+ # camembert-ner: a model fine-tuned from CamemBERT for the NER task
+
+ ## Introduction
+
+ [camembert-ner] is a named entity recognition (NER) model fine-tuned from CamemBERT on the wikiner-fr dataset.
+ The model was trained on a subset of the wikiner-fr dataset (~36,000 sentences).
+
+
+ ## How to use camembert-ner with HuggingFace
+
+ ##### Load camembert-ner and its sub-word tokenizer:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForTokenClassification
+
+ tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/camembert-ner")
+ model = AutoModelForTokenClassification.from_pretrained("Jean-Baptiste/camembert-ner")
+
+
+ # Process text sample (from Wikipedia)
+
+ from transformers import pipeline
+
+ # aggregation_strategy="simple" merges sub-word tokens into whole entities
+ # (it replaces the deprecated grouped_entities=True).
+ nlp = pipeline('ner', model=model, tokenizer=tokenizer, aggregation_strategy="simple")
+ nlp("Apple est créée le 1er avril 1976 dans le garage de la maison d'enfance de Steve Jobs à Los Altos en Californie par Steve Jobs, Steve Wozniak et Ronald Wayne14, puis constituée sous forme de société le 3 janvier 1977 à l'origine sous le nom d'Apple Computer, mais pour ses 30 ans et pour refléter la diversification de ses produits, le mot « computer » est retiré le 9 janvier 2015.")
+
+
+ # Output:
+ [{'entity_group': 'ORG',
+   'score': 0.9472818374633789,
+   'word': 'Apple',
+   'start': 0,
+   'end': 5},
+  {'entity_group': 'PER',
+   'score': 0.9838564991950989,
+   'word': 'Steve Jobs',
+   'start': 74,
+   'end': 85},
+  {'entity_group': 'LOC',
+   'score': 0.9831605950991312,
+   'word': 'Los Altos',
+   'start': 87,
+   'end': 97},
+  {'entity_group': 'LOC',
+   'score': 0.9834540486335754,
+   'word': 'Californie',
+   'start': 100,
+   'end': 111},
+  {'entity_group': 'PER',
+   'score': 0.9841555754343668,
+   'word': 'Steve Jobs',
+   'start': 115,
+   'end': 126},
+  {'entity_group': 'PER',
+   'score': 0.9843501806259155,
+   'word': 'Steve Wozniak',
+   'start': 127,
+   'end': 141},
+  {'entity_group': 'PER',
+   'score': 0.9841533899307251,
+   'word': 'Ronald Wayne',
+   'start': 144,
+   'end': 157},
+  {'entity_group': 'ORG',
+   'score': 0.9468960364659628,
+   'word': 'Apple Computer',
+   'start': 243,
+   'end': 257}]
+
+ ```
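The pipeline output in the README above is a plain list of dictionaries, so it can be post-processed without any extra library. The following is a minimal sketch, not part of the original commit: the helper name `entities_by_type` and its `min_score` parameter are hypothetical, and the data is a handful of entries copied from the sample output.

```python
from collections import defaultdict

# A few entries copied from the sample output in the README.
sample_output = [
    {'entity_group': 'ORG', 'score': 0.9472818374633789, 'word': 'Apple', 'start': 0, 'end': 5},
    {'entity_group': 'PER', 'score': 0.9838564991950989, 'word': 'Steve Jobs', 'start': 74, 'end': 85},
    {'entity_group': 'LOC', 'score': 0.9831605950991312, 'word': 'Los Altos', 'start': 87, 'end': 97},
    {'entity_group': 'PER', 'score': 0.9843501806259155, 'word': 'Steve Wozniak', 'start': 127, 'end': 141},
    {'entity_group': 'ORG', 'score': 0.9468960364659628, 'word': 'Apple Computer', 'start': 243, 'end': 257},
]


def entities_by_type(entities, min_score=0.0):
    """Group entity strings by label, keeping those scoring at least min_score.

    Hypothetical helper for illustration; not part of transformers.
    """
    grouped = defaultdict(list)
    for entity in entities:
        if entity['score'] >= min_score:
            grouped[entity['entity_group']].append(entity['word'])
    return dict(grouped)


print(entities_by_type(sample_output))
print(entities_by_type(sample_output, min_score=0.95))
```

Raising `min_score` trades recall for precision; in this sample, a threshold of 0.95 drops both ORG entities because their scores are about 0.947.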
+
+
+ ## Model performance (metric: seqeval)
+
+ Global
+ ```
+ 'precision': 0.8830965723967158
+ 'recall': 0.8915789473684211
+ 'f1': 0.8873174883781837
+ ```
+
+ By entity
+ ```
+ 'LOC': {'precision': 0.8701754385964913,
+         'recall': 0.8878281622911695,
+         'f1': 0.8789131718842291},
+ 'MISC': {'precision': 0.831053901850362,
+          'recall': 0.815955766192733,
+          'f1': 0.823435631725787},
+ 'ORG': {'precision': 0.8620199146514936,
+         'recall': 0.8335625859697386,
+         'f1': 0.8475524475524475},
+ 'PER': {'precision': 0.9367143476376246,
+         'recall': 0.9583148558758315,
+         'f1': 0.947391494958}
+ ```
+
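As a sanity check on the figures reported in the README, the F1 score is conventionally the harmonic mean of precision and recall. The sketch below, which is not part of the original commit, recomputes F1 from the reported precision and recall and prints it next to the reported value. The names `reported` and `f1_score` are illustrative, and the code does not call seqeval itself.

```python
import math

# (precision, recall, f1) triples copied from the README tables.
reported = {
    'global': (0.8830965723967158, 0.8915789473684211, 0.8873174883781837),
    'LOC': (0.8701754385964913, 0.8878281622911695, 0.8789131718842291),
    'MISC': (0.831053901850362, 0.815955766192733, 0.823435631725787),
    'ORG': (0.8620199146514936, 0.8335625859697386, 0.8475524475524475),
    'PER': (0.9367143476376246, 0.9583148558758315, 0.947391494958),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for name, (precision, recall, f1) in reported.items():
    computed = f1_score(precision, recall)
    # Compare with a loose tolerance because some reported values are truncated.
    status = "ok" if math.isclose(computed, f1, abs_tol=1e-6) else "mismatch"
    print(f"{name}: reported={f1:.6f} computed={computed:.6f} ({status})")
```

The global F1 is not the average of the per-entity F1 scores. The global precision and recall are reported separately, and the check above only confirms that each row is internally consistent.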