crodri committed on
Commit
cce4b98
1 Parent(s): 3e82849

Update README.md

Files changed (1)
  1. README.md +19 -57
README.md CHANGED
@@ -4,9 +4,10 @@ tags:
4
  - token-classification
5
  language:
6
  - es
 
7
  license: mit
8
  model-index:
9
- - name: es_anonimization_core_lg
10
  results:
11
  - task:
12
  name: NER
@@ -21,54 +22,26 @@ model-index:
21
  - name: NER F Score
22
  type: f_score
23
  value: 0.6911764706
24
- - task:
25
- name: POS
26
- type: token-classification
27
- metrics:
28
- - name: POS (UPOS) Accuracy
29
- type: accuracy
30
- value: 0.0
31
- - task:
32
- name: MORPH
33
- type: token-classification
34
- metrics:
35
- - name: Morph (UFeats) Accuracy
36
- type: accuracy
37
- value: 0.0
38
- - task:
39
- name: LEMMA
40
- type: token-classification
41
- metrics:
42
- - name: Lemma Accuracy
43
- type: accuracy
44
- value: 0.0
45
- - task:
46
- name: UNLABELED_DEPENDENCIES
47
- type: token-classification
48
- metrics:
49
- - name: Unlabeled Attachment Score (UAS)
50
- type: f_score
51
- value: 0.0
52
- - task:
53
- name: LABELED_DEPENDENCIES
54
- type: token-classification
55
- metrics:
56
- - name: Labeled Attachment Score (LAS)
57
- type: f_score
58
- value: 0.0
59
- - task:
60
- name: SENTS
61
- type: token-classification
62
- metrics:
63
- - name: Sentences F-Score
64
- type: f_score
65
- value: 0.0
66
  ---
67
- This is a Spacy multilingual anonimization model, for use with BSC's AnonymizationPipeline at https://github.com/TeMU-BSC/AnonymizationPipeline. The anonymization pipeline is a library for performing sensitive data identification and posterior anonymization of the detected data in Spanish and Catalan user generated plain text.
68
 
69
  | Feature | Description |
70
  | --- | --- |
71
- | **Name** | `es_anonimization_core_lg` |
72
  | **Version** | `1.0.0` |
73
  | **spaCy** | `>=3.2.3,<4.0.0` |
74
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
@@ -96,18 +69,7 @@ This is a Spacy multilingual anonimization model, for use with BSC's Anonymizati
96
 
97
  | Type | Score |
98
  | --- | --- |
99
- | `POS_ACC` | 0.00 |
100
- | `MORPH_ACC` | 0.00 |
101
- | `MORPH_PER_FEAT` | 0.00 |
102
- | `DEP_UAS` | 0.00 |
103
- | `DEP_LAS` | 0.00 |
104
- | `DEP_LAS_PER_TYPE` | 0.00 |
105
- | `SENTS_P` | 0.00 |
106
- | `SENTS_R` | 0.00 |
107
- | `SENTS_F` | 0.00 |
108
- | `LEMMA_ACC` | 0.00 |
109
  | `ENTS_F` | 69.12 |
110
  | `ENTS_P` | 74.60 |
111
  | `ENTS_R` | 64.38 |
112
- | `TOK2VEC_LOSS` | 0.00 |
113
- | `NER_LOSS` | 26573.78 |
 
4
  - token-classification
5
  language:
6
  - es
7
+ - ca
8
  license: mit
9
  model-index:
10
+ - name: ca_anonimization_core_lg
11
  results:
12
  - task:
13
  name: NER
 
22
  - name: NER F Score
23
  type: f_score
24
  value: 0.6911764706
25
+ widget:
26
+ - text: "La matrícula del coche es 8560 JXK y el nombre del propietario es Jon Permanyer Ugartemendia, DNI 362-69-58-6n. Tel: 628539864. Calle Pasteur 46 Bajos, 08024 Barcelona"
27
  ---
28
+
29
+ This is a spaCy multilingual (Catalan & Spanish) anonymization model, for use with BSC's AnonymizationPipeline at:
30
+
31
+ https://github.com/TeMU-BSC/AnonymizationPipeline.
32
+
33
+ pip install https://huggingface.co/PlanTL-GOB-ES/es_anonimization_core_lg/resolve/main/es_anonimization_core_lg-any-py3-none-any.whl
34
+
35
+ The anonymization pipeline is a library for identifying sensitive data in Spanish and Catalan user-generated plain text and then anonymizing the detected data.
36
+
37
+ This is not a standalone model and is meant to work within the pipeline.
38
+
39
+ The model can detect the following entities: `EMAIL`, `FINANCIAL`, `ID`, `LOC`, `MISC`, `ORG`, `PER`, `TELEPHONE`, `VEHICLE`, `ZIP`
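To make the entity list concrete, here is a minimal sketch of how labeled spans could be masked once they have been detected. It is not the AnonymizationPipeline API: the function `mask_entities` and the `(start, end, label)` span format are hypothetical, and the spans below are located by hand with `str.find` rather than predicted by the model. The labels assigned to them are likewise assumptions made for illustration.

```python
from typing import List, Tuple

# Entity labels listed in the model card.
LABELS = {"EMAIL", "FINANCIAL", "ID", "LOC", "MISC", "ORG",
          "PER", "TELEPHONE", "VEHICLE", "ZIP"}

Span = Tuple[int, int, str]  # (start offset, end offset, label); hypothetical format


def mask_entities(text: str, spans: List[Span]) -> str:
    """Replace each labeled character span with a <LABEL> placeholder."""
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        if start < cursor:
            raise ValueError("overlapping spans are not supported in this sketch")
        out.append(text[cursor:start])  # keep the text before the entity
        out.append(f"<{label}>")        # substitute the entity itself
        cursor = end
    out.append(text[cursor:])           # keep the remaining tail
    return "".join(out)


def find_span(text: str, needle: str, label: str) -> Span:
    """Hand-locate a substring; stands in for a model prediction."""
    start = text.find(needle)
    if start < 0:
        raise ValueError(f"{needle!r} not found")
    return (start, start + len(needle), label)


text = "Tel: 628539864. Calle Pasteur 46 Bajos, 08024 Barcelona"
spans = [
    find_span(text, "628539864", "TELEPHONE"),
    find_span(text, "08024", "ZIP"),
]
masked = mask_entities(text, spans)
print(masked)
```

Replacing from a sorted span list and tracking a cursor keeps the offsets valid even though each placeholder has a different length from the text it replaces.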
40
+
41
 
42
  | Feature | Description |
43
  | --- | --- |
44
+ | **Name** | `ca_anonimization_core_lg` |
45
  | **Version** | `1.0.0` |
46
  | **spaCy** | `>=3.2.3,<4.0.0` |
47
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
 
69
 
70
  | Type | Score |
71
  | --- | --- |
72
  | `ENTS_F` | 69.12 |
73
  | `ENTS_P` | 74.60 |
74
  | `ENTS_R` | 64.38 |
75
+ | `NER_LOSS` | 26573.78 |
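As a sanity check on the scores above, the sketch below recomputes the F-score from the reported precision and recall, assuming `ENTS_F` is the usual harmonic mean of `ENTS_P` and `ENTS_R`. The helper name `f_score` is illustrative. Because the table values are rounded to two decimals, the result should be close to the reported 69.12 but need not match it exactly.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the scores table (percentages).
ents_p, ents_r = 74.60, 64.38
ents_f = f_score(ents_p, ents_r)
print(f"{ents_f:.2f}")
```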