Update README.md
README.md
CHANGED
@@ -32,9 +32,10 @@ domain that shows 2.54 points of F1 score improvement on average on different bi
 entity recognition tasks.

 - **Developed by:** Rian Touchent, Eric Villemonte de La Clergerie
 - **License:** MIT

-

 <!-- Provide the basic links for the model. -->

@@ -51,9 +52,9 @@ entity recognition tasks.

 | **Corpus** | **Details** | **Size** |
 |------------|--------------------------------------------------------------------|------------|
-| ISTEX |
-| CLEAR |
-| E3C |
 | Total | | 413 M |


@@ -87,7 +88,7 @@ To ensure reliability, we averaged over 10 evaluations with different seeds.

 | Style | Dataset | Score | CamemBERT | CamemBERT |
 | :----------- | :------ | :---- | :---------------: | :-------------------: |
-|
 | | | P | 70\.12 ~~±~~ 1.93 | 71\.71 ~~±~~ 1.61 |
 | | | R | 70\.89 ~~±~~ 1.78 | **74\.42 ~~±~~ 1.49** |
 | | CAS2 | F1 | 79\.02 ~~±~~ 0.92 | **81\.66 ~~±~~ 0.59** |
@@ -96,20 +97,18 @@ To ensure reliability, we averaged over 10 evaluations with different seeds.
 | | E3C | F1 | 67\.63 ~~±~~ 1.45 | **69\.85 ~~±~~ 1.58** |
 | | | P | 78\.19 ~~±~~ 0.72 | **79\.11 ~~±~~ 0.42** |
 | | | R | 59\.61 ~~±~~ 2.25 | **62\.56 ~~±~~ 2.50** |
-|
 | | | P | 74\.62 ~~±~~ 1.97 | **76\.92 ~~±~~ 1.96** |
 | | | R | 73\.68 ~~±~~ 2.22 | **76\.52 ~~±~~ 1.62** |
-|
 | | | P | 64\.94 ~~±~~ 0.82 | **67\.77 ~~±~~ 0.88** |
 | | | R | 66\.56 ~~±~~ 0.56 | **69\.21 ~~±~~ 1.32** |


-## Environmental Impact

 <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
 - **Hardware Type:** 2 x Tesla V100
 - **Hours used:** 39 hours
 - **Provider:** INRIA clusters
@@ -32,9 +32,10 @@
 entity recognition tasks.

 - **Developed by:** Rian Touchent, Eric Villemonte de La Clergerie
+- **Logo by:** Alix Chagué
 - **License:** MIT

+<!-- ### Model Sources [optional] -->

 <!-- Provide the basic links for the model. -->

@@ -51,9 +52,9 @@

 | **Corpus** | **Details** | **Size** |
 |------------|--------------------------------------------------------------------|------------|
+| ISTEX | diverse scientific literature indexed on ISTEX | 276 M |
+| CLEAR | drug leaflets | 73 M |
+| E3C | various documents from journals, drug leaflets, and clinical cases | 64 M |
 | Total | | 413 M |


@@ -87,7 +88,7 @@

 | Style | Dataset | Score | CamemBERT | CamemBERT |
 | :----------- | :------ | :---- | :---------------: | :-------------------: |
+| Clinical | CAS1 | F1 | 70\.50 ~~±~~ 1.75 | **73\.03 ~~±~~ 1.29** |
 | | | P | 70\.12 ~~±~~ 1.93 | 71\.71 ~~±~~ 1.61 |
 | | | R | 70\.89 ~~±~~ 1.78 | **74\.42 ~~±~~ 1.49** |
 | | CAS2 | F1 | 79\.02 ~~±~~ 0.92 | **81\.66 ~~±~~ 0.59** |
@@ -96,20 +97,18 @@
 | | E3C | F1 | 67\.63 ~~±~~ 1.45 | **69\.85 ~~±~~ 1.58** |
 | | | P | 78\.19 ~~±~~ 0.72 | **79\.11 ~~±~~ 0.42** |
 | | | R | 59\.61 ~~±~~ 2.25 | **62\.56 ~~±~~ 2.50** |
+| Drug leaflets | EMEA | F1 | 74\.14 ~~±~~ 1.95 | **76\.71 ~~±~~ 1.50** |
 | | | P | 74\.62 ~~±~~ 1.97 | **76\.92 ~~±~~ 1.96** |
 | | | R | 73\.68 ~~±~~ 2.22 | **76\.52 ~~±~~ 1.62** |
+| Scientific | MEDLINE | F1 | 65\.73 ~~±~~ 0.40 | **68\.47 ~~±~~ 0.54** |
 | | | P | 64\.94 ~~±~~ 0.82 | **67\.77 ~~±~~ 0.88** |
 | | | R | 66\.56 ~~±~~ 0.56 | **69\.21 ~~±~~ 1.32** |


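The average gain quoted in the first hunk header (2.54 F1 points) can be cross-checked against the five F1 rows of the evaluation table (CAS1, CAS2, E3C, EMEA, MEDLINE). The following is a minimal sketch of that check: the dictionary layout, the function name, and the choice to treat the second score column as the improved model are illustrative assumptions, not part of the README.

```python
# F1 scores copied from the evaluation table: (first score column, second score column).
# ASSUMPTION: the second (mostly bold) column is the improved model.
f1_scores = {
    "CAS1": (70.50, 73.03),
    "CAS2": (79.02, 81.66),
    "E3C": (67.63, 69.85),
    "EMEA": (74.14, 76.71),
    "MEDLINE": (65.73, 68.47),
}


def mean_f1_gain(scores):
    """Average of (second column - first column) across all datasets."""
    gains = [second - first for first, second in scores.values()]
    return sum(gains) / len(gains)


average_gain = mean_f1_gain(f1_scores)
print(f"{average_gain:.2f}")  # prints 2.54
```

The per-dataset gains range from 2.22 (E3C) to 2.74 (MEDLINE), so the average is not driven by a single dataset.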
+## Environmental Impact estimation

 <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

 - **Hardware Type:** 2 x Tesla V100
 - **Hours used:** 39 hours
 - **Provider:** INRIA clusters
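The hardware and hours listed above are the inputs a calculator such as the Machine Learning Impact calculator would need. A rough estimate multiplies energy (GPU count × power draw × hours) by the carbon intensity of the grid. The sketch below applies that to the figures in the README. The per-GPU power draw (300 W, the rated TDP of the SXM2 V100 variant) and the carbon-intensity value are assumptions chosen purely for illustration; neither appears in the README, and the function name is hypothetical.

```python
def estimate_emissions(gpu_count, gpu_power_watts, hours, kg_co2_per_kwh):
    """Return (energy in kWh, emissions in kg CO2eq) for a training run."""
    energy_kwh = gpu_count * gpu_power_watts / 1000 * hours
    return energy_kwh, energy_kwh * kg_co2_per_kwh


# From the README: 2 x Tesla V100 for 39 hours.
# ASSUMPTION: 300 W per GPU (rated TDP of the SXM2 V100; the PCIe card is rated lower).
# ASSUMPTION: 0.1 kg CO2eq/kWh is a placeholder, not a measured value for any grid.
energy_kwh, emissions_kg = estimate_emissions(
    gpu_count=2, gpu_power_watts=300, hours=39, kg_co2_per_kwh=0.1
)
print(f"{energy_kwh:.1f} kWh, {emissions_kg:.2f} kg CO2eq")
```

This ignores datacenter overhead and actual GPU utilization, so it should be read as an order-of-magnitude figure rather than a reported measurement.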