The SwissBERT model fine-tuned on the WikiNEuRal dataset for multilingual NER.
Supports German, French and Italian as supervised languages and Romansh Grischun as a zero-shot language.
Usage
from transformers import pipeline
token_classifier = pipeline(
model="ZurichNLP/swissbert-ner",
aggregation_strategy="simple",
)
German example
token_classifier.model.set_default_language("de_CH")
token_classifier("Mein Name sei Gantenbein.")
Output:
[{'entity_group': 'PER',
'score': 0.5002625,
'word': 'Gantenbein',
'start': 13,
'end': 24}]
French example
token_classifier.model.set_default_language("fr_CH")
token_classifier("J'habite à Lausanne.")
Output:
[{'entity_group': 'LOC',
'score': 0.99955386,
'word': 'Lausanne',
'start': 10,
'end': 19}]
Citation
@article{vamvas-etal-2023-swissbert,
title={Swiss{BERT}: The Multilingual Language Model for Switzerland},
author={Jannis Vamvas and Johannes Gra\"en and Rico Sennrich},
year={2023},
eprint={2303.13310},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2303.13310}
}
- Downloads last month
- 257
Inference API (serverless) has been turned off for this model.