---
language:
- bar
library_name: flair
pipeline_tag: token-classification
base_model: deepset/gbert-large
widget:
- text: >-
    Dochau ( amtli : Dochau ) is a Grouße Kroasstod in Obabayern nordwestli vo
    Minga und liagt im gleichnoming Landkroas .
tags:
- flair
- token-classification
- sequence-tagger-model
- arxiv:2403.12749
license: apache-2.0
---

# Flair NER Model for Recognizing Named Entities in Bavarian Dialectal Data (Wikipedia)

This (unofficial) Flair NER model was trained on annotated Bavarian Wikipedia articles from the BarNER dataset, which was introduced in the LREC-COLING 2024 paper ["Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data"](https://aclanthology.org/2024.lrec-main.1262/) (also on [arXiv](https://arxiv.org/abs/2403.12749)) by Siyao Peng, Zihang Sun, Huangyan Shan, Marie Kolm, Verena Blaschke, Ekaterina Artemova, and Barbara Plank.

The [released dataset](https://github.com/mainlp/BarNER) is used in the *coarse* setting shown in Table 3 of the paper. The following named entity types are available (a loading sketch follows this list):

* `PER`
* `LOC`
* `ORG`
* `MISC`
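
As a minimal sketch, the released data could be loaded with Flair's `ColumnCorpus`, assuming the CoNLL-style token/tag column format of the BarNER release; the folder layout and file names below are illustrative, not taken from this model's actual training setup:

```python
from flair.datasets import ColumnCorpus

# column 0 holds the token, column 1 the (coarse) NER tag
columns = {0: "text", 1: "ner"}

# hypothetical local checkout of https://github.com/mainlp/BarNER
corpus = ColumnCorpus(
    "BarNER/data/bar-wiki",   # illustrative path
    columns,
    train_file="train.tsv",   # illustrative file names
    dev_file="dev.tsv",
    test_file="test.tsv",
)

# build the label dictionary covering PER, LOC, ORG and MISC
label_dict = corpus.make_label_dictionary(label_type="ner")
print(label_dict)
```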

## Fine-Tuning

We perform a hyper-parameter search over the following parameters:

* Batch Sizes: `[32, 16]`
* Learning Rates: `[7e-06, 8e-06, 9e-06, 1e-05]`
* Epochs: `[20]`
* Subword Pooling: `[first]`

As the base model, we use [GBERT Large](https://huggingface.co/deepset/gbert-large). We fine-tune with three different seeds and report the averaged F1-score on the development set:

| Configuration      | Run 1 | Run 2 | Run 3     | Avg.         |
|:-------------------|------:|------:|----------:|:-------------|
| `bs32-e20-lr1e-05` | 76.96 | 77.00 | **77.71** | 77.22 ± 0.34 |
| `bs32-e20-lr8e-06` | 76.75 | 76.21 | 77.38     | 76.78 ± 0.48 |
| `bs16-e20-lr1e-05` | 76.81 | 76.29 | 76.02     | 76.37 ± 0.33 |
| `bs32-e20-lr7e-06` | 75.44 | 76.71 | 75.90     | 76.02 ± 0.52 |
| `bs32-e20-lr9e-06` | 75.69 | 75.99 | 76.20     | 75.96 ± 0.21 |
| `bs16-e20-lr8e-06` | 74.82 | 76.83 | 76.14     | 75.93 ± 0.83 |
| `bs16-e20-lr7e-06` | 76.77 | 74.82 | 76.04     | 75.88 ± 0.80 |
| `bs16-e20-lr9e-06` | 76.55 | 74.25 | 76.54     | 75.78 ± 1.08 |
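
The reported averages are consistent with the population standard deviation over the three runs; a quick sanity check for the best configuration, using the values from the table above:

```python
import numpy as np

runs = [76.96, 77.00, 77.71]  # dev F1 of bs32-e20-lr1e-05

# np.std defaults to the population standard deviation (ddof=0),
# which reproduces the 77.22 ± 0.34 reported above
print(f"{np.mean(runs):.2f} ± {np.std(runs):.2f}")
```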

The hyper-parameter configuration `bs32-e20-lr1e-05` yields the best results on the development set, and we use this configuration to report the averaged F1-score on the test set:

| Configuration      | Run 1 | Run 2 | Run 3     | Avg.         |
|:-------------------|------:|------:|----------:|:-------------|
| `bs32-e20-lr1e-05` | 72.10 | 74.33 | **72.97** | 73.13 ± 0.92 |

Our averaged result on the test set is higher than the 72.17 reported in the original paper (see Table 5, in-domain training results).

For the upload, we used the best-performing model on the development set, marked in bold above. It achieves 72.97 on the final test set.
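
For reference, a minimal sketch of how such a fine-tuning run can be set up with Flair, using the best configuration (`bs32-e20-lr1e-05` with first-subword pooling). The `corpus` and `label_dict` are assumed to be loaded as sketched above; the linear-head setup (no CRF, no RNN) follows Flair's standard transformer fine-tuning recipe and is an assumption here, as is the output path:

```python
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# fine-tunable GBERT Large embeddings with first-subword pooling
embeddings = TransformerWordEmbeddings(
    model="deepset/gbert-large",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# plain linear classification head on top of the transformer
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "resources/taggers/barner-wiki-coarse",  # illustrative output path
    learning_rate=1e-05,
    mini_batch_size=32,
    max_epochs=20,
)
```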

## Flair Demo

The following snippet shows how to use this BarNER model with Flair:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("stefan-it/flair-barner-wiki-coarse-gbert-large")

# make example sentence
sentence = Sentence("Dochau ( amtli : Dochau ) is a Grouße Kroasstod in Obabayern nordwestli vo Minga und liagt im gleichnoming Landkroas .")

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
```
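
Each predicted span also carries its label value and confidence score, should you need them programmatically; a small follow-up sketch using Flair's span/label API:

```python
# print entity text, predicted tag, and model confidence for each span
for entity in sentence.get_spans("ner"):
    label = entity.get_label("ner")
    print(f"{entity.text} -> {label.value} ({label.score:.2f})")
```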