Marian Krotil
committed on
Commit fa24e9e (1 parent: c078932)
Update README.md
README.md CHANGED
@@ -20,7 +20,7 @@ This model is a fine-tuned checkpoint of [facebook/mbart-large-cc25](https://hug
 The model deals with the task ``Abstract + Text to Headline`` (AT2H) which consists in generating a one- or two-sentence summary considered as a headline from a Czech news text.
 
 ## Dataset
-The model has been trained on the [SumeCzech](https://ufal.mff.cuni.cz/sumeczech) dataset. The dataset includes around 1M Czech news-based documents consisting of a Headline, Abstract, and Full-text sections. Truncation and padding were configured for 512 tokens.
+The model has been trained on the [SumeCzech](https://ufal.mff.cuni.cz/sumeczech) dataset. The dataset includes around 1M Czech news-based documents consisting of a Headline, Abstract, and Full-text sections. Truncation and padding were configured for 512 tokens for the encoder and 64 for the decoder.
 
 ## Training
 The model has been trained on 1x NVIDIA Tesla A100 40GB for 40 hours. During training, the model has seen 2576K documents corresponding to roughly 3 epochs.
@@ -41,7 +41,7 @@ def summ_config():
 ("repetition_penalty", 1.2),
 ("no_repeat_ngram_size", None),
 ("early_stopping", True),
-("max_length",
+("max_length", 64),
 ("min_length", 10),
 ])),
 #texts to summarize
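The README change above states that truncation and padding use 512 tokens for the encoder and 64 for the decoder, which matches the new `max_length` of 64. The following is a minimal sketch of what fixed-length truncation and padding mean, not the author's preprocessing code. The helper `pad_or_truncate`, the placeholder token ids, and the pad id of 1 are assumptions made for illustration; a real tokenizer would also handle special tokens such as the end-of-sequence and language-code markers.

```python
from typing import List

ENCODER_MAX_LEN = 512  # from the README: limit for encoder inputs
DECODER_MAX_LEN = 64   # from the README: limit for decoder targets


def pad_or_truncate(token_ids: List[int], max_len: int, pad_id: int = 1) -> List[int]:
    """Cut the sequence to max_len, or right-pad it with pad_id up to max_len.

    pad_id=1 is an assumed placeholder; use the tokenizer's real pad id in practice.
    """
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


# Hypothetical token ids standing in for a tokenized article and headline.
article_ids = list(range(1000, 1700))   # 700 tokens, longer than the encoder limit
headline_ids = list(range(2000, 2012))  # 12 tokens, shorter than the decoder limit

encoder_input = pad_or_truncate(article_ids, ENCODER_MAX_LEN)
decoder_target = pad_or_truncate(headline_ids, DECODER_MAX_LEN)

print(len(encoder_input), len(decoder_target))
```

With these limits, a long article is cut to its first 512 token ids, while a short headline is padded out to 64, so every batch has a fixed shape on both sides.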