pl-diachronic-normalizer

Legend:

  • pruned datasets keep only the examples in which the source paragraph and the target paragraph differ, which reduces their size
  • hard datasets have their training and test splits drawn from separate pools of books with no overlap, so all paragraphs from a given book appear in exactly one split
  • transduced datasets have their training split processed by a rule-based normalizer
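
The pruned and hard variants from the legend can be illustrated with a minimal sketch. It is not the code used to build the datasets: the record fields `book_id`, `source`, and `target`, the function names `prune` and `hard_split`, and the `test_fraction` and `seed` parameters are all assumptions made for illustration. The rule-based normalizer behind the transduced variant is not described here, so it is left out.

```python
import random
from collections import defaultdict


def prune(examples):
    """Keep only examples whose source and target paragraphs differ."""
    return [ex for ex in examples if ex["source"] != ex["target"]]


def hard_split(examples, test_fraction=0.2, seed=0):
    """Split examples so that every book lands in exactly one split.

    The field name "book_id" and the split parameters are hypothetical.
    """
    # Group paragraphs by the book they come from.
    by_book = defaultdict(list)
    for ex in examples:
        by_book[ex["book_id"]].append(ex)

    # Shuffle the books (not the paragraphs) reproducibly.
    books = sorted(by_book)
    random.Random(seed).shuffle(books)

    # Reserve at least one whole book for the test split.
    n_test = max(1, round(len(books) * test_fraction))
    test_books = set(books[:n_test])

    train = [ex for b in books if b not in test_books for ex in by_book[b]]
    test = [ex for b in books if b in test_books for ex in by_book[b]]
    return train, test
```

Because whole books are assigned to a split, the test paragraphs never share a book with the training paragraphs, which is what distinguishes a hard split from a paragraph-level random split.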

Corresponding models were created from each of the 4 dataset variants.

Evaluation repositories:

https://github.com/kedudzic/pl-normalizer-evaluation (private)

https://github.com/kedudzic/pl-normalizer-evaluation-just-results (public)
