Add model card
README.md
ADDED
@@ -0,0 +1,24 @@
---
language: de
tags:
- summarization
datasets:
- mlsum
---
# mT5-small fine-tuned on German MLSUM
This model was fine-tuned for 3 epochs with a maximum input length (`max_len`) of 768 tokens and a maximum target length (`target_max_len`) of 192 tokens.
It was fine-tuned on all German articles in the train split of the [MLSUM dataset](https://huggingface.co/datasets/mlsum) with fewer than 384 "words" after splitting on whitespace, which left 80249 articles.
The exact expression to filter the dataset was the following:
```python
dataset = dataset.filter(lambda e: len(e['text'].split()) < 384)
```
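To make the effect of the threshold concrete, here is a minimal sketch that applies the same whitespace-based length check to a few toy records. The records and the helper name `is_short_enough` are hypothetical and only stand in for MLSUM rows; the predicate itself mirrors the lambda above.

```python
# Toy records standing in for MLSUM rows (hypothetical data);
# only the 'text' field matters for the filter.
records = [
    {"text": "Ein kurzer Artikel mit wenigen Wörtern."},
    {"text": "wort " * 383},  # 383 whitespace-separated tokens
    {"text": "wort " * 384},  # 384 whitespace-separated tokens
]

MAX_WORDS = 384  # threshold stated in the model card


def is_short_enough(example, max_words=MAX_WORDS):
    """Return True if the text has fewer than max_words whitespace-separated tokens."""
    return len(example["text"].split()) < max_words


# The comparison is strict, so an article with exactly 384 tokens is dropped.
kept = [r for r in records if is_short_enough(r)]
print(len(kept))
```

Because the comparison is strict, articles with exactly 384 whitespace-separated tokens are excluded. Note that these "words" are not model tokens: a 383-word article can still be longer than 768 subword tokens after tokenization.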
## Evaluation results
The fine-tuned model was evaluated on 2000 random articles from the validation set.
Mean [ROUGE F1 scores](https://github.com/pltrdy/rouge) were calculated for both the fine-tuned model and the lead-3 baseline (which simply outputs the first three sentences of the document) and are presented in the following table.

| Model         | Rouge-1 | Rouge-2 | Rouge-L |
| ------------- | -------:| -------:| -------:|
| mt5-small     | 0.399   | 0.318   | 0.392   |
| lead-3        | 0.343   | 0.263   | 0.341   |
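
The lead-3 baseline can be sketched in a few lines. The model card does not say how sentences were split, so the regex-based splitter below is an assumption for illustration only; it will mis-split German abbreviations such as "z. B.", and a proper sentence tokenizer may give different results. The function name `lead_3` is likewise hypothetical.

```python
import re


def lead_3(text, n_sentences=3):
    """Return the first n_sentences of text as a baseline summary.

    Sentence splitting is naive (an assumption, not the card's method):
    split on whitespace that follows '.', '!' or '?'.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:n_sentences])


article = "Erster Satz. Zweiter Satz! Dritter Satz? Vierter Satz."
summary = lead_3(article)
print(summary)
```

Summaries produced this way would then be scored against the reference summaries in the same way as the model outputs, so that both rows of the table are comparable.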