T-Systems-onsite/mt5-small-sum-de-en-v2

text2text-generation


Philip May committed on Sep 16, 2021

Commit 051fdfe · 1 Parent(s): 643acc2

Update README.md

Files changed (1)

README.md +9 -9


@@ -45,18 +45,18 @@ The MLSUM dataset has a special characteristic. In the text, the summary is ofte
 This model is trained on the following datasets:
-| Name | Language | Size | License
-|------|----------|------|--------
-| [CNN Daily - Train](https://github.com/abisee/cnn-dailymail) | en | 218,223 | The license is unclear. The data comes from CNN and Daily Mail. We assume that it may only be used for research purposes and not commercially.
-| [Extreme Summarization (XSum) - Train](https://github.com/EdinburghNLP/XSum) | en | 204,005 | The license is unclear. The data comes from BBC. We assume that it may only be used for research purposes and not commercially.
-| [MLSUM German - Train](https://github.com/ThomasScialom/MLSUM) | de | 218,043 | Usage of dataset is restricted to non-commercial research purposes only. Copyright belongs to the original copyright holders (see [here](https://github.com/ThomasScialom/MLSUM#mlsum)).
-| [SwissText 2019 - Train](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html) | de | 84,564 | The license is unclear. The data was published in the [German Text Summarization Challenge](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html). We assume that they may be used for research purposes and not commercially.
+| Name | Language | License
+|------|----------|--------
+| [CNN Daily - Train](https://github.com/abisee/cnn-dailymail) | en | The license is unclear. The data comes from CNN and Daily Mail. We assume that it may only be used for research purposes and not commercially.
+| [Extreme Summarization (XSum) - Train](https://github.com/EdinburghNLP/XSum) | en | The license is unclear. The data comes from BBC. We assume that it may only be used for research purposes and not commercially.
+| [MLSUM German - Train](https://github.com/ThomasScialom/MLSUM) | de | Usage of dataset is restricted to non-commercial research purposes only. Copyright belongs to the original copyright holders (see [here](https://github.com/ThomasScialom/MLSUM#mlsum)).
+| [SwissText 2019 - Train](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html) | de | The license is unclear. The data was published in the [German Text Summarization Challenge](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html). We assume that they may be used for research purposes and not commercially.
 | Language | Size
 |------|------
-| German | xxx
-| English | xxx
-| Total | xxx
+| German | 302,607
+| English | 422,228
+| Total | 724,835
 ## Evaluation on MLSUM German Test Set (no beams)
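
The per-language totals added in this commit can be cross-checked against the per-dataset sizes that the same commit removes from the first table. The following is a minimal Python sketch of that check; the variable names (`dataset_sizes`, `totals`) are illustrative assumptions and are not part of the repository.

```python
# Per-dataset training sizes, copied from the table rows removed in this commit.
# The dictionary layout is an assumption made for this sketch.
dataset_sizes = {
    "CNN Daily - Train": ("en", 218_223),
    "Extreme Summarization (XSum) - Train": ("en", 204_005),
    "MLSUM German - Train": ("de", 218_043),
    "SwissText 2019 - Train": ("de", 84_564),
}

# Sum the sizes per language code.
totals = {}
for lang, size in dataset_sizes.values():
    totals[lang] = totals.get(lang, 0) + size

# Overall number of training samples across all four datasets.
totals["total"] = sum(size for _, size in dataset_sizes.values())

print(totals)
```

Under these inputs the sums come to 302,607 for German, 422,228 for English, and 724,835 in total, which agrees with the values that replace the `xxx` placeholders.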