crodri committed on
Commit
fd3f3e5
·
1 Parent(s): 19870b8

second version

.gitattributes CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  es_pharmaconer_ner_trf-3.4.0-py3-none-any.whl filter=lfs diff=lfs merge=lfs -text
  transformer/model filter=lfs diff=lfs merge=lfs -text
+ es_pharmaconer_ner_trf-any-py3-none-any.whl filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,18 +1,10 @@
1
  ---
2
-
3
- language:
4
- - es
5
-
6
  tags:
7
- - biomedical
8
- - clinical
9
- - eHR
10
  - spacy
11
  - token-classification
12
-
13
- license: apache-2.0
14
-
15
-
16
  model-index:
17
  - name: es_pharmaconer_ner_trf
18
  results:
@@ -22,112 +14,46 @@ model-index:
22
  metrics:
23
  - name: NER Precision
24
  type: precision
25
- value: 0.0.9109481404
26
  - name: NER Recall
27
  type: recall
28
  value: 0.9152631579
29
  - name: NER F Score
30
  type: f_score
31
  value: 0.9109481404
32
-
33
- widget:
34
- - text: "Se realizó estudio analítico destacando incremento de niveles de PTH y vitamina D (103,7 pg/ml y 272 ng/ml, respectivamente), atribuidos al exceso de suplementación de vitamina D."
35
- - text: "Por el hallazgo de múltiples fracturas por estrés, se procedió a estudio en nuestras consultas, realizándose análisis con función renal, calcio sérico y urinario, calcio iónico, magnesio y PTH, que fueron normales."
36
- - text: "Se solicitó una analítica que incluía hemograma, bioquímica, anticuerpos antinucleares (ANA) y serologías, examen de orina, así como biopsia de la lesión. Los resultados fueron normales, con ANA, anti-Sm, anti-RNP, anti-SSA, anti-SSB, anti-Jo1 y anti-Scl70 negativos."
37
-
38
  ---
39
- #
40
 
41
- ## Table of Contents
42
  <details>
43
- <summary>Click to expand</summary>
44
-
45
- - [Overview](#overview)
46
- - [Model description](#model-description)
47
- - [How to use](#how-to-use)
48
- - [Intended uses and limitations](#intended-uses-and-limitations)
49
- - [Training](#training)
50
- - [Training data](#training-data)
51
- - [Training procedure](#training-procedure)
52
- - [Evaluation](#evaluation)
53
- - [Evaluation results](#evaluation-results)
54
- - [Additional information](#additional-information)
55
- - [Contact information](#contact-information)
56
- - [Copyright](#copyright)
57
- - [Licensing information](#licensing-information)
58
- - [Funding](#funding)
59
- - [Citation information](#citation-information)
60
- - [Disclaimer](#disclaimer)
61
-
62
- </details>
63
-
64
- ## Overview
65
- - **Architecture:**
66
- - **Language:**
67
- - **Task:**
68
- - **Data:**
69
-
70
- ## Model description
71
- Basic Spacy BioNER pipeline, with a RoBERTa-based model [bsc-bio-ehr-es] (https://huggingface.co/PlanTL-GOB-ES/bsc-bio-ehr-es) and a dataset, Pharmaconer, a NER dataset annotated with substances, compounds and proteins entities. For further information, check the [official website](https://temu.bsc.es/pharmaconer/). Visit our [GitHub repository](https://github.com/PlanTL-GOB-ES/lm-biomedical-clinical-es). This work was funded by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) within the framework of the Plan-TL",
72
- "author":"The Text Mining Unit from Barcelona Supercomputing Center.
73
-
74
-
75
- ## Intended uses and limitations
76
-
77
-
78
- ## How to use
79
-
80
-
81
- ## Limitations and bias
82
-
83
 
84
- ## Training
85
 
86
- ### Training data
 
 
87
 
88
- ### Training procedure
89
-
90
-
91
- ## Evaluation
92
-
93
- ### Evaluation results
94
-
95
-
96
- ## Additional information
97
-
98
- ### Author
99
- Text Mining Unit (TeMU) at the Barcelona Supercomputing Center (bsc-temu@bsc.es)
100
-
101
- ### Contact information
102
- For further information, send an email to <plantl-gob-es@bsc.es>
103
-
104
- ### Copyright
105
- Copyright by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) (2022)
106
-
107
- ### Licensing information
108
- [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
109
-
110
- ### Funding
111
- This work was funded by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) within the framework of the Plan-TL.
112
-
113
-
114
- ### Citation Information
115
-
116
- ### Disclaimer
117
-
118
- <details>
119
- <summary>Click to expand</summary>
120
-
121
- The models published in this repository are intended for a generalist purpose and are available to third parties. These models may have bias and/or any other undesirable distortions.
122
-
123
- When third parties, deploy or provide systems and/or services to other parties using any of these models (or using systems based on these models) or become users of the models, they should note that it is their responsibility to mitigate the risks arising from their use and, in any event, to comply with applicable regulations, including regulations regarding the use of Artificial Intelligence.
124
-
125
- In no event shall the owner of the models (SEDIA – State Secretariat for Digitalization and Artificial Intelligence) nor the creator (BSC – Barcelona Supercomputing Center) be liable for any results arising from the use made by third parties of these models.
126
-
127
-
128
- Los modelos publicados en este repositorio tienen una finalidad generalista y están a disposición de terceros. Estos modelos pueden tener sesgos y/u otro tipo de distorsiones indeseables.
129
 
130
- Cuando terceros desplieguen o proporcionen sistemas y/o servicios a otras partes usando alguno de estos modelos (o utilizando sistemas basados en estos modelos) o se conviertan en usuarios de los modelos, deben tener en cuenta que es su responsabilidad mitigar los riesgos derivados de su uso y, en todo caso, cumplir con la normativa aplicable, incluyendo la normativa en materia de uso de inteligencia artificial.
131
 
132
- En ningún caso el propietario de los modelos (SEDIA – Secretaría de Estado de Digitalización e Inteligencia Artificial) ni el creador (BSC – Barcelona Supercomputing Center) serán responsables de los resultados derivados del uso que hagan terceros de estos modelos.
133
- </details>
1
  ---
 
 
 
 
2
  tags:
 
 
 
3
  - spacy
4
  - token-classification
5
+ language:
6
+ - es
7
+ license: mit
 
8
  model-index:
9
  - name: es_pharmaconer_ner_trf
10
  results:
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.9066736184
18
  - name: NER Recall
19
  type: recall
20
  value: 0.9152631579
21
  - name: NER F Score
22
  type: f_score
23
  value: 0.9109481404
 
 
 
 
 
 
24
  ---
25
+ Basic spaCy BioNER pipeline built with the RoBERTa-based model [bsc-bio-ehr-es](https://huggingface.co/PlanTL-GOB-ES/bsc-bio-ehr-es) and Pharmaconer, a NER dataset annotated with substance, compound, and protein entities. For further information, check the [official website](https://temu.bsc.es/pharmaconer/) and visit our [GitHub repository](https://github.com/PlanTL-GOB-ES/lm-biomedical-clinical-es). This work was funded by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) within the framework of the Plan-TL.
+
+ | Feature | Description |
+ | --- | --- |
+ | **Name** | `es_pharmaconer_ner_trf` |
+ | **Version** | `3.4.1` |
+ | **spaCy** | `>=3.4.1,<3.5.0` |
+ | **Default Pipeline** | `transformer`, `ner` |
+ | **Components** | `transformer`, `ner` |
+ | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
+ | **Sources** | n/a |
+ | **License** | `mit` |
+ | **Author** | [The Text Mining Unit from Barcelona Supercomputing Center.](https://huggingface.co/PlanTL-GOB-ES/) |
+
+ ### Label Scheme
40
 
 
41
  <details>

+ <summary>View label scheme (4 labels for 1 components)</summary>

+ | Component | Labels |
+ | --- | --- |
+ | **`ner`** | `NORMALIZABLES`, `NO_NORMALIZABLES`, `PROTEINAS`, `UNCLEAR` |

+ </details>
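
The label scheme above suggests a simple way to consume the pipeline: load it and keep only the entity labels of interest. The following is a minimal sketch, not the authors' documented usage. It assumes the wheel from this repository has been installed with `pip` so that `spacy.load("es_pharmaconer_ner_trf")` resolves; `load_pipeline` and `entities_by_label` are hypothetical helper names introduced here for illustration.

```python
from typing import Iterable, List, Tuple

# Labels copied from the label scheme table in the README.
PHARMACONER_LABELS = {"NORMALIZABLES", "NO_NORMALIZABLES", "PROTEINAS", "UNCLEAR"}


def load_pipeline(name: str = "es_pharmaconer_ner_trf"):
    """Load the installed pipeline package.

    Assumption: the .whl from this repository was pip-installed beforehand.
    spaCy is imported lazily so the helper below also works without it.
    """
    import spacy

    return spacy.load(name)


def entities_by_label(
    doc, labels: Iterable[str] = PHARMACONER_LABELS
) -> List[Tuple[str, str]]:
    """Return (text, label) pairs for entities whose label is in `labels`."""
    wanted = set(labels)
    return [(ent.text, ent.label_) for ent in doc.ents if ent.label_ in wanted]
```

With the package installed, `entities_by_label(load_pipeline()("Se solicitó calcio sérico y PTH."), {"PROTEINAS"})` would return only the protein mentions the model finds; which spans it actually tags depends on the trained weights.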

+ ### Accuracy

+ | Type | Score |
+ | --- | --- |
+ | `ENTS_F` | 91.09 |
+ | `ENTS_P` | 90.67 |
+ | `ENTS_R` | 91.53 |
+ | `TRANSFORMER_LOSS` | 15719.51 |
+ | `NER_LOSS` | 22469.88 |
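
The accuracy table reports percentages, while the `model-index` front matter stores the same scores as fractions. As a quick consistency check, the sketch below recomputes the F score from the precision and recall values in the front matter, assuming `ENTS_F` is the standard harmonic mean of precision and recall. The function name `f_score` is introduced here for illustration.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the model-index block of the updated README.
precision = 0.9066736184
recall = 0.9152631579
f1 = f_score(precision, recall)

# Scale to percentages to compare with the accuracy table.
print(f"ENTS_P={precision * 100:.2f} ENTS_R={recall * 100:.2f} ENTS_F={f1 * 100:.2f}")
# prints: ENTS_P=90.67 ENTS_R=91.53 ENTS_F=91.09
```

The recomputed value agrees with the `0.9109481404` stored under `NER F Score`, which is consistent with the corrected precision of `0.9066736184` introduced by this commit.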
es_pharmaconer_ner_trf-any-py3-none-any.whl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ed5f08f4df81f25c5f007e59d0b8bd98541f6e0c13572aa347b87858690a9ca9
+ size 440440035
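
The three added lines are a Git LFS pointer, not the wheel itself: the repository stores only the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, and the pointer text is copied from the diff above.

```python
from typing import Dict

POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ed5f08f4df81f25c5f007e59d0b8bd98541f6e0c13572aa347b87858690a9ca9
size 440440035
"""


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)
algorithm, _, digest = pointer["oid"].partition(":")
size_mb = int(pointer["size"]) / 1_000_000  # decimal megabytes

print(algorithm, digest[:12], f"{size_mb:.1f} MB")
```

After downloading the actual wheel, the digest can be compared against `hashlib.sha256` of the file contents to confirm the download is intact.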
meta.json CHANGED
@@ -1,20 +1,19 @@
  {
  "lang":"es",
  "name":"pharmaconer_ner_trf",
- "version":"3.4.0",
- "spacy_version":">=3.4.1,<3.5.0",
+ "version":"3.4.1",
  "description":"Basic Spacy BioNER pipeline, with a RoBERTa-based model [bsc-bio-ehr-es] (https://huggingface.co/PlanTL-GOB-ES/bsc-bio-ehr-es) and a dataset, Pharmaconer, a NER dataset annotated with substances, compounds and proteins entities. For further information, check the [official website](https://temu.bsc.es/pharmaconer/). Visit our [GitHub repository](https://github.com/PlanTL-GOB-ES/lm-biomedical-clinical-es). This work was funded by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) within the framework of the Plan-TL",
  "author":"The Text Mining Unit from Barcelona Supercomputing Center.",
  "email":"plantl-gob-es@bsc.es",
  "url":"https://huggingface.co/PlanTL-GOB-ES/",
- "license":"Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)",
+ "license":"mit",
+ "spacy_version":">=3.4.1,<3.5.0",
  "spacy_git_version":"Unknown",
  "vectors":{
  "width":0,
  "vectors":0,
  "keys":0,
- "name":null,
- "mode":"default"
+ "name":null
  },
  "labels":{
  "transformer":[
@@ -66,5 +65,8 @@
  },
  "transformer_loss":157.1950596615,
  "ner_loss":224.6988260417
- }
- }
+ },
+ "requirements":[
+ "spacy-transformers>=1.1.8,<1.2.0"
+ ]
+ }
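
The updated `meta.json` pins two version ranges: `spacy_version` and the new `requirements` entry for `spacy-transformers`. The sketch below shows how such a range can be checked against an installed version using only the standard library. It is a simplified illustration that handles just the `>=` and `<` operators on plain `X.Y.Z` versions; `parse_version` and `satisfies` are hypothetical helpers, and for real projects `packaging.specifiers.SpecifierSet` is the more robust choice.

```python
from typing import Tuple

# Ranges copied from the updated meta.json.
SPACY_SPEC = ">=3.4.1,<3.5.0"
TRANSFORMERS_SPEC = ">=1.1.8,<1.2.0"


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec of '>=' and '<' clauses."""
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            if not current >= parse_version(clause[2:]):
                return False
        elif clause.startswith("<") and not clause.startswith("<="):
            if not current < parse_version(clause[1:]):
                return False
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("3.4.1", SPACY_SPEC))         # lower bound is inclusive
print(satisfies("3.5.0", SPACY_SPEC))         # upper bound is exclusive
print(satisfies("1.1.9", TRANSFORMERS_SPEC))
```

Comparing integer tuples rather than raw strings avoids the classic pitfall where `"3.10.0" < "3.5.0"` holds lexicographically.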