fin-simple-mBART / README.md
metadata
language:
  - fi
tags:
  - simplification
  - mBART
library_name: fairseq

This is an mBART model (https://github.com/facebookresearch/fairseq/tree/main/examples/mbart) fine-tuned for Finnish sentence simplification. It is distributed as a fairseq checkpoint. Persistent identifier (PID) on Kielipankki: http://urn.fi/urn:nbn:fi:lb-2024011801.

Paper: Towards Automatic Finnish Text Simplification (Dmitrieva & Tiedemann, DeTermIt-WS 2024).

The fine-tuning data can be obtained here: http://urn.fi/urn:nbn:fi:lb-2024011703. To replicate the results, use the IDs of the training, validation, and test sentence pairs, which are provided in the "splits.zip" archive in this repository. Each ID has the following format: "{regular text id}__{simple text id}__{sentence pair number}".
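
The ID format above can be split into its three fields with a few lines of Python. This is only a minimal sketch: the names `PairId` and `parse_pair_id` and the example ID are hypothetical, and it assumes that the text ids themselves never contain a double underscore and that the sentence pair number is an integer. The layout of the files inside "splits.zip" is not described here, so reading them is left out.

```python
from typing import NamedTuple


class PairId(NamedTuple):
    """Fields of one sentence pair ID (hypothetical helper type)."""
    regular_text_id: str
    simple_text_id: str
    pair_number: int


def parse_pair_id(pair_id: str) -> PairId:
    """Split '{regular text id}__{simple text id}__{sentence pair number}'.

    Assumes the text ids contain no '__' and the pair number is an integer.
    """
    parts = pair_id.strip().split("__")
    if len(parts) != 3:
        raise ValueError(f"Expected 3 fields separated by '__', got: {pair_id!r}")
    regular, simple, number = parts
    return PairId(regular, simple, int(number))


# Made-up ID for illustration only; real IDs come from splits.zip.
example = parse_pair_id("regular42__simple17__3")
print(example.regular_text_id, example.simple_text_id, example.pair_number)
```

Parsing the IDs this way makes it possible to group sentence pairs by their source documents when rebuilding the training, validation, and test sets.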