---
language: pl
tags:
- T5
- lemmatization
license: apache-2.0
---

# PoLemma Large

PoLemma models are intended for lemmatization of named entities and multi-word expressions in the Polish language. They were fine-tuned from the allegro/plT5 models, e.g. [allegro/plt5-large](https://huggingface.co/allegro/plt5-large).

## Usage

Sample usage:

```python
from transformers import pipeline

pipe = pipeline(
    task="text2text-generation",
    model="amu-cai/polemma-large",
    tokenizer="amu-cai/polemma-large",
)

hyp = [
    res["generated_text"]
    for res in pipe(
        ["federalnego urzędu statystycznego"],
        clean_up_tokenization_spaces=True,
        num_beams=5,
    )
][0]
```

## Evaluation results

Lemmatization Exact Match was computed on the SlavNER 2021 test set.

| Model | Exact Match |
| :--- | ---: |
| [polemma-large]() | 92.61 |
| [polemma-base]() | 91.34 |
| [polemma-small]() | 88.46 |

## Citation

If you use the model, please cite the following paper: TBD

### Framework versions

- Transformers 4.26.0
- Pytorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2
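As a rough illustration of the Exact Match metric reported above, the following sketch shows one plausible way such a score could be computed; `exact_match` is a hypothetical helper written for this card, not the official SlavNER 2021 scorer, and any normalization choices (here, only whitespace stripping) are assumptions.

```python
def exact_match(predictions, references):
    """Percentage of predicted lemmas that match the reference lemma exactly.

    Note: this is an illustrative sketch, not the official SlavNER scorer;
    only leading/trailing whitespace is normalized before comparison.
    """
    assert len(predictions) == len(references)
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Example: one correct lemma out of two predictions gives 50.0
score = exact_match(
    ["federalny urząd statystyczny", "unia europejski"],
    ["federalny urząd statystyczny", "unia europejska"],
)
```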