---
language:
- bar
library_name: flair
pipeline_tag: token-classification
base_model: deepset/gbert-large
widget:
- text: >-
    Dochau ( amtli : Dochau ) is a Grouße Kroasstod in Obabayern nordwestli vo
    Minga und liagt im gleichnoming Landkroas .
tags:
- flair
- token-classification
- sequence-tagger-model
- arxiv:2403.12749
- "O'zapft is!"
- 🥨
license: apache-2.0
---
# Flair NER Model for Recognizing Named Entities in Bavarian Dialectal Data (Wikipedia)
[![🥨](https://huggingface.co/stefan-it/flair-barner-wiki-coarse-gbert-large/resolve/main/logo.webp "🥨")](https://huggingface.co/stefan-it/flair-barner-wiki-coarse-gbert-large)
This (unofficial) Flair NER model was trained on annotated Bavarian Wikipedia articles from the BarNER dataset, which was proposed in the ["Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data"](https://aclanthology.org/2024.lrec-main.1262/) paper (LREC-COLING 2024, also on [arXiv](https://arxiv.org/abs/2403.12749)) by Siyao Peng, Zihang Sun, Huangyan Shan, Marie Kolm, Verena Blaschke, Ekaterina Artemova, and Barbara Plank.
The [released dataset](https://github.com/mainlp/BarNER) is used in the *coarse* setting shown in Table 3 of the paper. The following named entity types are available (an illustrative example follows the list):
* `PER`
* `LOC`
* `ORG`
* `MISC`
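For illustration only (this snippet is constructed from the widget sentence above, not taken from the corpus), coarse annotations in a CoNLL-style BIO format could look like this:
```text
Dochau     B-LOC
is         O
a          O
Grouße     O
Kroasstod  O
in         O
Obabayern  B-LOC
```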
## Fine-Tuning
We perform a hyper-parameter search over the following parameters (a training sketch follows the list):
* Batch Sizes: `[32, 16]`
* Learning Rates: `[7e-06, 8e-06, 9e-06, 1e-05]`
* Epochs: `[20]`
* Subword Pooling: `[first]`
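As a minimal sketch, one such fine-tuning run with Flair could look like the following; the data folder, column mapping, tagger architecture, and output path are assumptions, not the released training setup:
```python
from flair.datasets import ColumnCorpus
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Load the BarNER splits (folder and column format are assumptions)
corpus = ColumnCorpus("./barner-wiki-coarse", {0: "text", 1: "ner"})
label_dict = corpus.make_label_dictionary(label_type="ner")

# GBERT Large with first-subword pooling, fine-tuned end-to-end
embeddings = TransformerWordEmbeddings(
    model="deepset/gbert-large",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# FLERT-style tagger: linear classifier on top, no RNN, no CRF
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# One configuration from the grid above: bs32-e20-lr1e-05
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "resources/taggers/barner-wiki-coarse",
    learning_rate=1e-05,
    mini_batch_size=32,
    max_epochs=20,
)
```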
As the base model we use [GBERT Large](https://huggingface.co/deepset/gbert-large). We fine-tune with three different seeds and report the averaged F1-score on the development set:
| Configuration      | Run 1 | Run 2 | Run 3     | Avg.         |
|:-------------------|:------|:------|:----------|:-------------|
| `bs32-e20-lr1e-05` | 76.96 | 77.00 | **77.71** | 77.22 ± 0.34 |
| `bs32-e20-lr8e-06` | 76.75 | 76.21 | 77.38     | 76.78 ± 0.48 |
| `bs16-e20-lr1e-05` | 76.81 | 76.29 | 76.02     | 76.37 ± 0.33 |
| `bs32-e20-lr7e-06` | 75.44 | 76.71 | 75.90     | 76.02 ± 0.52 |
| `bs32-e20-lr9e-06` | 75.69 | 75.99 | 76.20     | 75.96 ± 0.21 |
| `bs16-e20-lr8e-06` | 74.82 | 76.83 | 76.14     | 75.93 ± 0.83 |
| `bs16-e20-lr7e-06` | 76.77 | 74.82 | 76.04     | 75.88 ± 0.80 |
| `bs16-e20-lr9e-06` | 76.55 | 74.25 | 76.54     | 75.78 ± 1.08 |
The hyper-parameter configuration `bs32-e20-lr1e-05` yields the best results on the development set, and we use this configuration to report the averaged F1-score on the test set:
| Configuration      | Run 1 | Run 2 | Run 3     | Avg.         |
|:-------------------|:------|:------|:----------|:-------------|
| `bs32-e20-lr1e-05` | 72.10 | 74.33 | **72.97** | 73.13 ± 0.92 |
Our averaged result on the test set is higher than the 72.17 reported in the original paper (see Table 5, in-domain training results).
For the upload we used the model that performed best on the development set (marked in bold in the tables above); it achieves 72.97 on the final test set.
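As a minimal sketch, the test-set evaluation could be reproduced with Flair's built-in `evaluate` method; the data folder and checkpoint path below are assumptions:
```python
from flair.datasets import ColumnCorpus
from flair.models import SequenceTagger

# Load the BarNER splits (folder and column format are assumptions)
corpus = ColumnCorpus("./barner-wiki-coarse", {0: "text", 1: "ner"})

# Load a locally fine-tuned checkpoint (path is an assumption)
tagger = SequenceTagger.load("resources/taggers/barner-wiki-coarse/best-model.pt")

# Micro-averaged F1 over gold "ner" labels is Flair's default main score
result = tagger.evaluate(corpus.test, gold_label_type="ner")
print(result.main_score)
```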
# Flair Demo
The following snippet shows how to use the fine-tuned NER model with Flair:
```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("stefan-it/flair-barner-wiki-coarse-gbert-large")

# make example sentence
sentence = Sentence("Dochau ( amtli : Dochau ) is a Grouße Kroasstod in Obabayern nordwestli vo Minga und liagt im gleichnoming Landkroas .")

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')

# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
```