File size: 2,133 Bytes
c078932
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
fa24e9e
c078932
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
fa24e9e
c078932
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
---
language:
- cs
- cs
tags:
- abstractive summarization
- mbart-cc25
- Czech
license: apache-2.0
datasets:
- SumeCzech dataset news-based
metrics:
- rouge
- rougeraw
---

# mBART fine-tuned model for Czech abstractive summarization (AT2H-S)
This model is a fine-tuned checkpoint of [facebook/mbart-large-cc25](https://huggingface.co/facebook/mbart-large-cc25) on the Czech news dataset to produce Czech abstractive summaries.
## Task
The model deals with the task ``Abstract + Text to Headline`` (AT2H) which consists in generating a one- or two-sentence summary considered as a headline from a Czech news text.

## Dataset
The model has been trained on the [SumeCzech](https://ufal.mff.cuni.cz/sumeczech) dataset. The dataset includes around 1M Czech news-based documents consisting of a Headline, Abstract, and Full-text sections. Truncation and padding were configured for 512 tokens for the encoder and 64 for the decoder.

## Training
The model has been trained on 1x NVIDIA Tesla A100 40GB for 40 hours. During training, the model has seen 2576K documents corresponding to roughly 3 epochs.

# Use
Assuming you are using the provided Summarizer.ipynb file.
```python
def summ_config():
    cfg = OrderedDict([
        # summarization model - checkpoint from website
        ("model_name", "krotima1/mbart-at2h-s"),
        ("inference_cfg", OrderedDict([
            ("num_beams", 4),
            ("top_k", 40),
            ("top_p", 0.92),
            ("do_sample", True),
            ("temperature", 0.89),
            ("repetition_penalty", 1.2),
            ("no_repeat_ngram_size", None),
            ("early_stopping", True),
            ("max_length", 64),
            ("min_length", 10),
        ])),
        #texts to summarize
        ("text",
            [
                "Input your Czech text",
            ]
        ),
    ])
    return cfg
cfg = summ_config()
#load model    
model = AutoModelForSeq2SeqLM.from_pretrained(cfg["model_name"])
tokenizer = AutoTokenizer.from_pretrained(cfg["model_name"])
# init summarizer
summarize = Summarizer(model, tokenizer, cfg["inference_cfg"])
summarize(cfg["text"])
```