---
language: en
tags:
- summarization
license: apache-2.0
datasets:
- cnn_dailymail
- xsum
thumbnail: https://huggingface.co/front/thumbnails/distilbart_medium.png
---
# Distilbart-cnn-12-6

## Table of Contents
- [Model Details](#model-details)
- [How to Get Started With the Model](#how-to-get-started-with-the-model)
- [Uses](#uses)
- [Risks, Limitations and Biases](#risks-limitations-and-biases)
- [Training](#training)
- [Evaluation](#evaluation)

## Model Details
- **Model Description:** DistilBART-cnn-12-6 is a distilled version of [BART large CNN](https://huggingface.co/facebook/bart-large-cnn) with 12 encoder layers and 6 decoder layers, trained for English news summarization.
- **Developed by:** Sam Shleifer
- **Model Type:** Summarization
- **Language(s):** English
- **License:** Apache-2.0
- **Parent Model:** See the [BART large CNN model](https://huggingface.co/facebook/bart-large-cnn) for more information about the BART-large model, which is similarly trained on the [CNN Dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset.
- **Resources for more information:**
  - [BART documentation](https://huggingface.co/docs/transformers/model_doc/bart#transformers.BartForConditionalGeneration)
  - [BART large CNN model paper](https://arxiv.org/abs/1910.13461)


## How to Get Started With the Model
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")
model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distilbart-cnn-12-6")
```
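Once loaded, a summary can be produced with `model.generate`. The sketch below is illustrative, not prescribed by this model card: the input text is placeholder sample text, and the generation settings (`num_beams`, `max_length`, `min_length`) are assumed values that can be tuned.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")
model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distilbart-cnn-12-6")

# Placeholder article text; substitute any English news-style passage
article = (
    "The Eiffel Tower is 324 metres tall, about the same height as an "
    "81-storey building, and was the tallest man-made structure in the "
    "world for 41 years after its completion in 1889. Its base is square, "
    "measuring 125 metres on each side."
)

# Truncate to the model's maximum input length (1024 tokens for BART)
inputs = tokenizer(article, max_length=1024, truncation=True, return_tensors="pt")

# Beam search is a common choice for CNN/DailyMail-style summaries
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=142, min_length=20)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```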


## Uses


#### Direct Use
This model can be used for text summarization.
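For quick experimentation, the `transformers` summarization pipeline wraps tokenization, generation, and decoding in one call. The input text and length settings below are illustrative assumptions:

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

# Placeholder input text for demonstration
text = (
    "Researchers released a new open-source library this week that simplifies "
    "training large language models on consumer hardware. The library supports "
    "several popular architectures and has already been downloaded thousands of "
    "times, according to its maintainers."
)
result = summarizer(text, max_length=60, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```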


## Risks, Limitations and Biases
### Limitations
This model makes use of the [CNN Dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset, which is an English-language dataset containing just over 300k unique news articles as written by journalists at CNN and the Daily Mail.
The BCP-47 code for English as generally spoken in the United States is en-US and the BCP-47 code for English as generally spoken in the United Kingdom is en-GB. It is unknown if other varieties of English are represented in the data.

### Biases
[Bordia and Bowman (2019)](https://www.aclweb.org/anthology/N19-3002.pdf) explore measuring gender bias and debiasing techniques in the CNN / Dailymail dataset, the Penn Treebank, and WikiText-2. They find the CNN / Dailymail dataset to have a slightly lower gender bias based on their metric compared to the other datasets, but still show evidence of gender bias when looking at words such as 'fragile'.

Further information, e.g. regarding uses, out-of-scope uses, and the training procedure for the CNN Dailymail dataset, is available in its [dataset card](https://huggingface.co/datasets/cnn_dailymail).


## Training

This checkpoint should be loaded into `BartForConditionalGeneration.from_pretrained`. See the [BART docs](https://huggingface.co/transformers/model_doc/bart.html?#transformers.BartForConditionalGeneration) for more information.
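Loading the checkpoint as described above can be sketched as follows; this is equivalent to the `AutoModelForSeq2SeqLM` call in the quick-start, since the auto class resolves to `BartForConditionalGeneration` for this checkpoint:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

model = BartForConditionalGeneration.from_pretrained("sshleifer/distilbart-cnn-12-6")
tokenizer = BartTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")
```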

## Evaluation 

### Metrics for DistilBART models

| Model Name                 | Params (M) | Inference Time (ms) | Speedup | ROUGE-2 | ROUGE-L |
|:---------------------------|-----------:|--------------------:|--------:|--------:|--------:|
| distilbart-xsum-12-1       |        222 |                  90 |    2.54 |   18.31 |   33.37 |
| distilbart-xsum-6-6        |        230 |                 132 |    1.73 |   20.92 |   35.73 |
| distilbart-xsum-12-3       |        255 |                 106 |    2.16 |   21.37 |   36.39 |
| distilbart-xsum-9-6        |        268 |                 136 |    1.68 |   21.72 |   36.61 |
| bart-large-xsum (baseline) |        406 |                 229 |    1    |   21.85 |   36.50 |
| distilbart-xsum-12-6       |        306 |                 137 |    1.68 |   22.12 |   36.99 |
| bart-large-cnn (baseline)  |        406 |                 381 |    1    |   21.06 |   30.63 |
| distilbart-12-3-cnn        |        255 |                 214 |    1.78 |   20.57 |   30.00 |
| distilbart-12-6-cnn        |        306 |                 307 |    1.24 |   21.26 |   30.59 |
| distilbart-6-6-cnn         |        230 |                 182 |    2.09 |   20.17 |   29.70 |
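The Speedup column is each model's matching baseline inference time divided by its own (bart-large-xsum for the xsum checkpoints, bart-large-cnn for the cnn checkpoints). A quick sketch verifying a few rows from the table:

```python
def speedup(baseline_ms, model_ms):
    """Speedup = baseline inference time / model inference time, rounded to 2 decimals."""
    return round(baseline_ms / model_ms, 2)

# Baselines from the table: bart-large-xsum = 229 ms, bart-large-cnn = 381 ms
print(speedup(229, 90))   # distilbart-xsum-12-1 -> 2.54
print(speedup(381, 214))  # distilbart-12-3-cnn  -> 1.78
print(speedup(381, 307))  # distilbart-12-6-cnn  -> 1.24
```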