metadata

license: cc-by-sa-4.0
datasets:
  - cjvt/cc_gigafida
language:
  - sl
tags:
  - word case classification

T5-slo-word-shape-corrector

This T5 model is designed to detect and correct Slovenian words whose shape (grammatical form) does not fit the surrounding context.

Model Output Example

Imagine we have the following Slovenian text:

Model v besedilu popravljaj besede, ki imeti nepravilno obliko.

The model might return the following text (note: predictions chosen for demonstration/explanation, not reproducibility!):

Model v besedilu popravlja besede, ki imajo nepravilno obliko.

In the input sentence, the words popravljaj and imeti are written in a grammatical form that does not fit the context: popravljaj is an imperative and imeti is an infinitive, where the sentence calls for present-tense forms. The model corrects them to popravlja and imajo.
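To see which words were changed, the input and the output can be compared word by word. The snippet below is a minimal sketch using only Python's standard `difflib`; the helper name `changed_words` and the whitespace tokenization are assumptions made for illustration and are not part of the model itself.

```python
import difflib


def changed_words(source, corrected):
    """Return (original, replacement) pairs for word spans that differ.

    Hypothetical helper: tokens are split on whitespace, so punctuation
    stays attached to the neighbouring word.
    """
    src_tokens = source.split()
    out_tokens = corrected.split()
    matcher = difflib.SequenceMatcher(a=src_tokens, b=out_tokens, autojunk=False)

    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join each differing span back into a string for readability.
            changes.append((" ".join(src_tokens[i1:i2]), " ".join(out_tokens[j1:j2])))
    return changes


source = "Model v besedilu popravljaj besede, ki imeti nepravilno obliko."
corrected = "Model v besedilu popravlja besede, ki imajo nepravilno obliko."

pairs = changed_words(source, corrected)
for original, replacement in pairs:
    print(f"{original} -> {replacement}")
```

For the example sentence this lists the two corrected words. A word-level diff like this only shows what changed, not whether the change is correct.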

More details

Evaluating the model on generated test sets gives the following results (detection and correction of words with incorrect shapes are scored together):

Precision: 0.911
Recall: 0.811
F1: 0.858
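The F1 score is the harmonic mean of precision and recall, so the three reported values can be checked against each other. The snippet below is a small sketch of that check; the function name `f1_score` is chosen here for illustration and does not refer to the evaluation code used for the model.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the results reported above.
f1 = f1_score(0.911, 0.811)
print(round(f1, 3))  # 0.858
```

The result agrees with the reported F1 to three decimal places.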