Update README.md
README.md
CHANGED
@@ -35,8 +35,16 @@ It achieves the following results on the evaluation set:
 - F1: 0.8638
 
 ## Model description
+Multilingual Named Entity Recognition across several languages
 
-
+
+For this project's token classification, I built a unique custom model head.
+WikiANN or PAN-X, a subset of the Cross-lingual TRansfer Evaluation of Multilingual
+Encoders (XTREME) benchmark, was applied. This project was completed for a customer based
+in Switzerland, where the four languages that are most frequently spoken are
+German (62.9% of articles), French (22.9%), Italian (8.4%), and English (5.9%).
+Each article is tagged with "inside-outside-beginning" (IOB2) tags for LOC (location),
+PER (person), and ORG (organization).
 
 ## Intended uses & limitations
 
@@ -75,4 +83,4 @@ The following hyperparameters were used during training:
 - Transformers 4.26.1
 - Pytorch 1.13.1+cu116
 - Datasets 2.10.0
-- Tokenizers 0.13.2
+- Tokenizers 0.13.2
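
The model description added in this change refers to IOB2 tags for LOC, PER, and ORG. As a minimal sketch of what that tagging scheme encodes (this is not code from the repository; the helper name `iob2_to_spans` and the sample sentence are hypothetical and chosen only for illustration), IOB2 tags can be collapsed into entity spans like this:

```python
from typing import List, Tuple


def iob2_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse IOB2 tags into (entity_type, start, end) spans, end exclusive.

    Hypothetical helper for illustration; not part of this model's code.
    """
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        # An open span continues only on an I- tag of the same entity type.
        continues = tag.startswith("I-") and current == tag[2:]
        if not continues:
            # Close whatever span was open before this token.
            if current is not None:
                spans.append((current, start, i))
                start, current = None, None
            # In IOB2, every entity starts with a B- tag.
            if tag.startswith("B-"):
                start, current = i, tag[2:]
            # A stray I- tag with no matching open span is ignored here.
    if current is not None:
        spans.append((current, start, len(tags)))
    return spans


# Made-up example sentence with one tag per token.
tokens = ["Jeff", "Dean", "works", "at", "Google", "in", "Zurich"]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "O", "B-LOC"]

spans = iob2_to_spans(tags)
entities = [(label, " ".join(tokens[s:e])) for label, s, e in spans]
print(entities)
```

The `B-` prefix marks the first token of an entity, `I-` marks a continuation of the same entity, and `O` marks tokens outside any entity. Ignoring stray `I-` tags is one possible design choice; an evaluation library may treat them differently.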