---
pipeline_tag: summarization
language:
- ko
tags:
- T5
---
# t5-base-korean-summarization
This is a [T5](https://huggingface.co/docs/transformers/model_doc/t5) model fine-tuned for Korean text summarization.
It was fine-tuned on the three datasets listed below.
- [Korean Paper Summarization Dataset(논문자료 요약)](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=90)
- [Korean Book Summarization Dataset(도서자료 요약)](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=93)
- [Korean Summary statement and Report Generation Dataset(요약문 및 레포트 생성 데이터)](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=90)
# Usage (HuggingFace Transformers)
```python
import nltk
nltk.download('punkt')
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained('eenzeenee/t5-base-korean-summarization')
tokenizer = AutoTokenizer.from_pretrained('eenzeenee/t5-base-korean-summarization')
prefix = "summarize: "
sample = """
안녕하세요? 우리 (2학년)/(이 학년) 친구들 우리 친구들 학교에 가서 진짜 (2학년)/(이 학년) 이 되고 싶었는데 학교에 못 가고 있어서 답답하죠?
그래도 우리 친구들의 안전과 건강이 최우선이니까요 오늘부터 선생님이랑 매일 매일 국어 여행을 떠나보도록 해요.
어/ 시간이 벌써 이렇게 됐나요? 늦었어요. 늦었어요. 빨리 국어 여행을 떠나야 돼요.
그런데 어/ 국어여행을 떠나기 전에 우리가 준비물을 챙겨야 되겠죠? 국어 여행을 떠날 준비물, 교안을 어떻게 받을 수 있는지 선생님이 설명을 해줄게요.
(EBS)/(이비에스) 초등을 검색해서 들어가면요 첫화면이 이렇게 나와요.
자/ 그러면요 여기 (X)/(엑스) 눌러주(고요)/(구요). 저기 (동그라미)/(똥그라미) (EBS)/(이비에스) (2주)/(이 주) 라이브특강이라고 되어있죠?
거기를 바로 가기를 누릅니다. 자/ (누르면요)/(눌르면요). 어떻게 되냐? b/ 밑으로 내려요 내려요 내려요 쭉 내려요.
우리 몇 학년이죠? 아/ (2학년)/(이 학년) 이죠 (2학년)/(이 학년)의 무슨 과목? 국어.
이번주는 (1주)/(일 주) 차니까요 여기 교안. 다음주는 여기서 다운을 받으면 돼요.
이 교안을 클릭을 하면, 짜잔/. 이렇게 교재가 나옵니다 .이 교안을 (다운)/(따운)받아서 우리 국어여행을 떠날 수가 있어요.
그럼 우리 진짜로 국어 여행을 한번 떠나보도록 해요? 국어여행 출발. 자/ (1단원)/(일 단원) 제목이 뭔가요? 한번 찾아봐요.
시를 즐겨요 에요. 그냥 시를 읽어요 가 아니에요. 시를 즐겨야 돼요 즐겨야 돼. 어떻게 즐길까? 일단은 내내 시를 즐기는 방법에 대해서 공부를 할 건데요.
그럼 오늘은요 어떻게 즐길까요? 오늘 공부할 내용은요 시를 여러 가지 방법으로 읽기를 공부할겁니다.
어떻게 여러가지 방법으로 읽을까 우리 공부해 보도록 해요. 오늘의 시 나와라 짜잔/! 시가 나왔습니다 시의 제목이 뭔가요? 다툰 날이에요 다툰 날.
누구랑 다퉜나 동생이랑 다퉜나 언니랑 친구랑? 누구랑 다퉜는지 선생님이 시를 읽어 줄 테니까 한번 생각을 해보도록 해요."""
inputs = [prefix + sample]
inputs = tokenizer(inputs, max_length=512, truncation=True, return_tensors="pt")
output = model.generate(**inputs, num_beams=3, do_sample=True, min_length=10, max_length=64)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
result = nltk.sent_tokenize(decoded_output.strip())[0]
print('RESULT >>', result)
# Example output (sampling is enabled, so results may vary):
# RESULT >> 국어 여행을 떠나기 전에 국어 여행을 떠날 준비물과 교안을 어떻게 받을 수 있는지 선생님이 설명해 준다.
```
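The snippet above summarizes a single document. As a minimal sketch of how the same steps might be wrapped for several documents at once, the helpers below prepend the `summarize: ` prefix, tokenize with padding, and keep only the first sentence of each decoded summary. The function names (`build_inputs`, `first_sentence`, `summarize_batch`) are hypothetical and not part of this repository. Two things differ from the snippet above and are assumptions: `first_sentence` uses a simple regex instead of `nltk.sent_tokenize`, and `do_sample=False` is passed so that beam search is deterministic.

```python
import re
from typing import List

PREFIX = "summarize: "  # same task prefix as in the usage snippet


def build_inputs(texts: List[str], prefix: str = PREFIX) -> List[str]:
    """Strip surrounding whitespace and prepend the task prefix to each document."""
    return [prefix + text.strip() for text in texts]


def first_sentence(summary: str) -> str:
    """Return the text up to and including the first '.', '?' or '!'.

    This is a simplification standing in for nltk.sent_tokenize; it will
    cut too early on text such as decimals or abbreviations.
    """
    summary = summary.strip()
    match = re.search(r"[.?!]", summary)
    return summary if match is None else summary[: match.end()]


def summarize_batch(model, tokenizer, texts: List[str],
                    max_input_length: int = 512,
                    max_summary_length: int = 64) -> List[str]:
    """Summarize several documents in one generate() call.

    `model` and `tokenizer` are expected to be the objects loaded with
    AutoModelForSeq2SeqLM / AutoTokenizer in the usage snippet.
    """
    encoded = tokenizer(
        build_inputs(texts),
        max_length=max_input_length,
        truncation=True,
        padding=True,          # pad to the longest document in the batch
        return_tensors="pt",
    )
    output = model.generate(
        **encoded,
        num_beams=3,
        do_sample=False,       # assumption: deterministic beam search
        min_length=10,
        max_length=max_summary_length,
    )
    decoded = tokenizer.batch_decode(output, skip_special_tokens=True)
    return [first_sentence(text) for text in decoded]
```

With the model and tokenizer loaded as in the usage snippet, `summarize_batch(model, tokenizer, [sample])` would return a one-element list. Inputs longer than 512 tokens are truncated, exactly as in the single-document example.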
# Evaluation Result
# Training
# Model Architecture
```
T5ForConditionalGeneration(
  (shared): Embedding(50358, 768)
  (encoder): T5Stack(
    (embed_tokens): Embedding(50358, 768)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
              (relative_attention_bias): Embedding(32, 12)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseGatedActDense(
              (wi_0): Linear(in_features=768, out_features=2048, bias=False)
              (wi_1): Linear(in_features=768, out_features=2048, bias=False)
              (wo): Linear(in_features=2048, out_features=768, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): NewGELUActivation()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1~11): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseGatedActDense(
              (wi_0): Linear(in_features=768, out_features=2048, bias=False)
              (wi_1): Linear(in_features=768, out_features=2048, bias=False)
              (wo): Linear(in_features=2048, out_features=768, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): NewGELUActivation()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (decoder): T5Stack(
    (embed_tokens): Embedding(50358, 768)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
              (relative_attention_bias): Embedding(32, 12)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseGatedActDense(
              (wi_0): Linear(in_features=768, out_features=2048, bias=False)
              (wi_1): Linear(in_features=768, out_features=2048, bias=False)
              (wo): Linear(in_features=2048, out_features=768, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): NewGELUActivation()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1~11): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseGatedActDense(
              (wi_0): Linear(in_features=768, out_features=2048, bias=False)
              (wi_1): Linear(in_features=768, out_features=2048, bias=False)
              (wo): Linear(in_features=2048, out_features=768, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): NewGELUActivation()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (lm_head): Linear(in_features=768, out_features=50358, bias=False)
)
```
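The layer shapes printed above are enough to estimate the parameter count by hand. The sketch below does that arithmetic under a few stated assumptions: each stack has 12 blocks (the `(0)` and `(1~11)` entries), each `T5LayerNorm` holds a single weight vector of size 768, and `shared` and the two `embed_tokens` entries refer to one embedding matrix. Whether `lm_head` shares its weights with that embedding cannot be read from the printout, so the helper takes a hypothetical `tied_lm_head` flag and reports both cases. All function names are illustrative.

```python
# Sizes read directly from the printed module tree.
D_MODEL = 768        # hidden size
D_FF = 2048          # feed-forward inner size
VOCAB = 50358        # vocabulary size
N_BLOCKS = 12        # blocks (0) and (1~11) in each stack
REL_BIAS = 32 * 12   # relative_attention_bias, first block of each stack only


def attention_params() -> int:
    """q, k, v, o projections, all bias-free."""
    return 4 * D_MODEL * D_MODEL


def ff_params() -> int:
    """Gated feed-forward: wi_0, wi_1 and wo, all bias-free."""
    return 3 * D_MODEL * D_FF


def encoder_params() -> int:
    """Self-attention + feed-forward + 2 layer norms per block, plus stack extras."""
    per_block = attention_params() + ff_params() + 2 * D_MODEL
    return N_BLOCKS * per_block + REL_BIAS + D_MODEL  # + final_layer_norm


def decoder_params() -> int:
    """Self-attention + cross-attention + feed-forward + 3 layer norms per block."""
    per_block = 2 * attention_params() + ff_params() + 3 * D_MODEL
    return N_BLOCKS * per_block + REL_BIAS + D_MODEL  # + final_layer_norm


def total_params(tied_lm_head: bool = False) -> int:
    """Sum of all parts; assumption: the embedding matrix is counted once."""
    embedding = VOCAB * D_MODEL
    lm_head = 0 if tied_lm_head else D_MODEL * VOCAB
    return embedding + encoder_params() + decoder_params() + lm_head


if __name__ == "__main__":
    print(f"encoder: {encoder_params():,}")
    print(f"decoder: {decoder_params():,}")
    print(f"total (separate lm_head): {total_params():,}")
    print(f"total (tied lm_head):     {total_params(tied_lm_head=True):,}")
```

To check the estimate against the real checkpoint, `sum(p.numel() for p in model.parameters())` on the loaded model gives the actual count, with shared tensors counted once.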
## Citation
- Raffel, Colin, et al. "Exploring the limits of transfer learning with a unified text-to-text transformer." J. Mach. Learn. Res. 21.140 (2020): 1-67.