readme: add initial version of model card
#1 by stefan-it

README.md (new file):
---
language:
- en
- ka
license: mit
tags:
- flair
- token-classification
- sequence-tagger-model
base_model: xlm-roberta-large
widget:
- text: ამით თავისი ქადაგება დაასრულა და დაბრუნდა იერუსალიმში . ერთ-ერთ გარე კედელზე
    არსებობს ერნესტო ჩე გევარას პორტრეტი . შაკოსკა“ ინახება ბრაზილიაში , სან-პაულუს
    ხელოვნების მუზეუმში .
---

# Fine-tuned English-Georgian NER Model with Flair

This Flair NER model was fine-tuned on the WikiANN dataset
([Rahimi et al.](https://www.aclweb.org/anthology/P19-1015) splits)
using XLM-R Large as the backbone language model.

**Note**: This dataset is problematic, as it was automatically constructed.

We manually inspected the development split of the Georgian data and found
many badly labeled examples, e.g. DVD ( 💿 ) tagged as `ORG`.
## Fine-Tuning

The latest
[Flair version](https://github.com/flairNLP/flair/tree/f30f5801df3f9e105ed078ec058b4e1152dd9159)
is used for fine-tuning.

We use the English and Georgian training splits for fine-tuning and the
Georgian development set for evaluation.

A hyper-parameter search over the following parameters, with 5 different seeds per configuration, is performed:

* Batch Sizes: [`4`]
* Learning Rates: [`5e-06`]

More details can be found in this [repository](https://github.com/stefan-it/georgian-ner).
## Results

A hyper-parameter search with 5 different seeds per configuration is performed, and the micro F1-score on the Georgian
development set is reported:

| Configuration     | Seed 1      | Seed 2      | Seed 3          | Seed 4      | Seed 5      | Average         |
|-------------------|-------------|-------------|-----------------|-------------|-------------|-----------------|
| `bs4-e10-lr5e-06` | [0.9005][1] | [0.9012][2] | [**0.9069**][3] | [0.9050][4] | [0.9048][5] | 0.9037 ± 0.0027 |

[1]: https://hf.co/stefan-it/autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-1
[2]: https://hf.co/stefan-it/autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-2
[3]: https://hf.co/stefan-it/autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-3
[4]: https://hf.co/stefan-it/autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-4
[5]: https://hf.co/stefan-it/autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-5
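The reported average and standard deviation follow directly from the per-seed scores in the table; a quick sketch to verify them (assuming the sample standard deviation is what is reported):

```python
from statistics import mean, stdev

# per-seed micro F1-scores on the Georgian development set (from the table above)
scores = [0.9005, 0.9012, 0.9069, 0.9050, 0.9048]

avg = mean(scores)  # average over the 5 seeds
sd = stdev(scores)  # sample standard deviation

print(f"{avg:.4f} ± {sd:.4f}")  # → 0.9037 ± 0.0027
```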

The result in bold shows the performance of this model.

Additionally, the Flair [training log](training.log) and [TensorBoard logs](tensorboard) are also uploaded to the model hub.