DEplain/trimmed_longmbart_docs_apa

Text2Text Generation

text simplification

easy-to-read language

document simplification


omarmomen committed on Jul 1, 2023

Commit 8da597c (1 parent: 07ea877)

Update README.md


README.md +22 -1


@@ -5,4 +5,25 @@ language:
 tags:
 - text simplification
 - german
----

+datasets:
+- DEplain/DEplain-APA
+metrics:
+- sari
+- bleu
+- bertscore
+library_name: transformers
+pipeline_tag: text2text-generation
+---
+# DEplain German Text Simplification
+This model is part of the experiments described in Stodden, Momen, and Kallmeyer (2023), ["DEplain: A German Parallel Corpus with Intralingual Translations into Plain Language for Sentence and Document Simplification."](https://arxiv.org/abs/2305.18939) In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, Canada. Association for Computational Linguistics.
+Detailed documentation is available in the GitHub repository [https://github.com/rstodden/DEPlain](https://github.com/rstodden/DEPlain).
+### Model Description
+The model is a finetuned checkpoint of the pre-trained LongmBART model, which is based on `mbart-large-cc25` and uses a vocabulary trimmed to the 30k most frequent words in German.
+The model was finetuned for the task of German document-level text simplification.
+The finetuning data consisted only of manually aligned sentences from the `DEplain-APA-doc` dataset.
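
The vocabulary trimming mentioned in the model description can be pictured with a small, self-contained sketch. This is only an illustration under stated assumptions, not the procedure used by the authors: the function names `trim_vocabulary` and `encode`, the toy corpus, and the whitespace-level tokens are all hypothetical, and the real model works on SentencePiece subword units rather than whole words. The idea shown is simply to keep the special tokens plus the `keep` most frequent tokens and map everything else to the unknown token.

```python
from collections import Counter

# Hypothetical special tokens for this sketch; they occupy the first ids.
SPECIAL_TOKENS = ("<s>", "<pad>", "</s>", "<unk>")


def trim_vocabulary(corpus_tokens, keep, special_tokens=SPECIAL_TOKENS):
    """Build a token-to-id mapping from the `keep` most frequent tokens."""
    counts = Counter(tok for sentence in corpus_tokens for tok in sentence)
    vocab = {tok: idx for idx, tok in enumerate(special_tokens)}
    limit = len(special_tokens) + keep
    for tok, _ in counts.most_common():
        if len(vocab) >= limit:
            break
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab, unk="<unk>"):
    """Map tokens to ids, sending anything outside the vocabulary to `unk`."""
    return [vocab.get(tok, vocab[unk]) for tok in tokens]


# Toy corpus (an assumption for illustration only).
corpus = [
    ["das", "ist", "ein", "haus"],
    ["das", "haus", "ist", "alt"],
    ["das", "ist", "gut"],
]

vocab = trim_vocabulary(corpus, keep=3)
ids = encode(["das", "haus", "ist", "alt"], vocab)
print(vocab)
print(ids)
```

With `keep=3`, only the three most frequent tokens of the toy corpus survive, and the rarer token `alt` is encoded as `<unk>`. A smaller vocabulary shrinks the embedding matrix, at the cost of mapping tokens outside the retained set to the unknown id.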