KoChatBART
BART (Bidirectional and Auto-Regressive Transformers) is trained as an autoencoder: noise is added to part of the input text, and the model learns to reconstruct the original text. Korean Chat BART (hereafter KoChatBART) is a Korean encoder-decoder language model trained on more than about 10GB of Korean dialogue text, using the Text Infilling noise function from the BART paper. We release the resulting model, KoChatBART-base, which is robust for dialogue generation.
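To make the Text Infilling objective more concrete, the sketch below corrupts a token list by replacing whole spans with a single mask token, so the model has to recover both the missing tokens and how many there were. The 30% masking ratio and Poisson(3) span lengths follow the BART paper; whether KoChatBART used the same values is an assumption. `text_infilling`, `sample_poisson`, and the `<mask>` string are illustrative names, not the actual training code.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def text_infilling(tokens, mask_token="<mask>", mask_ratio=0.3, lam=3.0, seed=0):
    """Replace random spans of `tokens` with one mask token per span.

    mask_ratio and lam are assumed values taken from the BART paper,
    not confirmed settings for KoChatBART.
    """
    rng = random.Random(seed)
    budget = int(round(len(tokens) * mask_ratio))  # max number of tokens to hide
    out, i = [], 0
    while i < len(tokens):
        if budget > 0 and rng.random() < mask_ratio:
            span = min(sample_poisson(lam, rng), budget, len(tokens) - i)
            out.append(mask_token)  # a single mask stands in for the whole span
            i += span
            budget -= span
            if span > 0:
                continue
            # span == 0: the mask was inserted without hiding any token
        out.append(tokens[i])
        i += 1
    return out


if __name__ == "__main__":
    print(text_infilling("we release a dialogue model for Korean chat".split()))
```

The start probability used here is only a rough way to spread spans over the sequence; it caps the number of hidden tokens at the budget rather than hitting the ratio exactly.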
Quick tour
from transformers import AutoTokenizer, BartForConditionalGeneration
# Load the tokenizer and the pre-trained weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("BM-K/KoChatBART")
model = BartForConditionalGeneration.from_pretrained("BM-K/KoChatBART")
# Encode a short Korean greeting and run a forward pass
inputs = tokenizer("안녕 세상아!", return_tensors="pt")
outputs = model(**inputs)
Pre-training data preprocessing
Datasets used
- Topic-specific text daily conversation data
- Small-business customer order question-answer text
- Korean SNS
- AI language data for civil-complaint task automation
To train KoChatBART, we applied the following preprocessing to the Korean dialogue datasets and then merged them into a large Korean dialogue corpus.
- To reduce redundancy in the data, an expression repeated more than twice in a row, such as 'ㅋㅋㅋㅋㅋㅋ', was shortened to two repetitions, as in 'ㅋㅋ'.
- Because very short utterances can hinder training, only data whose total token length exceeds 3 under the KoBART tokenizer was kept.
- Pseudonymized data was removed.
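The first two cleaning rules above can be sketched as follows. This is a minimal illustration, not the authors' script: it collapses runs of a single repeated character only, and it uses whitespace splitting as a stand-in for the KoBART tokenizer (pass the real tokenizer's `tokenize` method as `tokenize` in practice). The function names and the order of the two steps are assumptions.

```python
import re

# A run of three or more identical characters is shortened to exactly two.
_REPEAT = re.compile(r"(.)\1{2,}")


def collapse_repeats(text):
    """Shorten character runs such as 'ㅋㅋㅋㅋㅋㅋ' to 'ㅋㅋ'."""
    return _REPEAT.sub(r"\1\1", text)


def keep_utterance(text, tokenize=str.split, min_tokens=3):
    """Keep only utterances with more than `min_tokens` tokens."""
    return len(tokenize(text)) > min_tokens


def preprocess(utterances, tokenize=str.split):
    """Collapse repeats first, then drop utterances that are too short."""
    cleaned = (collapse_repeats(u) for u in utterances)
    return [u for u in cleaned if keep_utterance(u, tokenize)]


if __name__ == "__main__":
    print(preprocess(["ㅋㅋㅋㅋㅋㅋ", "오늘 점심 뭐 먹을까 ㅋㅋㅋㅋ"]))
```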
Model
Model | # of params | vocab size | Type | # of layers | # of heads | ffn_dim | hidden_dims |
---|---|---|---|---|---|---|---|
KoChatBART | 139M | 50265 | Encoder | 6 | 16 | 3072 | 768 |
KoChatBART | 139M | 50265 | Decoder | 6 | 16 | 3072 | 768 |
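As a sanity check, the 139M figure can be roughly reproduced from the other columns of the table. The sketch below counts the weights of a standard BART-style encoder-decoder with tied token embeddings. The 1026-row position tables (1024 positions plus an offset of 2) are an assumption borrowed from the Hugging Face BART implementation and are not stated in this document; the number of heads does not affect the count.

```python
def bart_param_count(vocab=50265, d_model=768, ffn_dim=3072,
                     enc_layers=6, dec_layers=6, max_positions=1026):
    """Count parameters of a BART-style model with tied token embeddings.

    max_positions=1026 is an assumed value; the rest comes from the table.
    """
    attention = 4 * (d_model * d_model + d_model)   # q, k, v, out projections with biases
    feed_forward = 2 * d_model * ffn_dim + ffn_dim + d_model
    layer_norm = 2 * d_model                        # scale and bias

    encoder_layer = attention + feed_forward + 2 * layer_norm
    # A decoder layer adds cross-attention and a third layer norm.
    decoder_layer = 2 * attention + feed_forward + 3 * layer_norm

    token_embeddings = vocab * d_model              # shared by encoder, decoder, LM head
    position_embeddings = 2 * max_positions * d_model  # one table per stack
    embedding_norms = 2 * layer_norm                # one per stack

    return (token_embeddings + position_embeddings + embedding_norms
            + enc_layers * encoder_layer + dec_layers * decoder_layer)


if __name__ == "__main__":
    print(f"{bart_param_count() / 1e6:.1f}M")  # 139.4M
```

Under these assumptions the token embeddings account for about 38.6M parameters, the encoder for about 42.5M, and the decoder for about 56.7M.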
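Dialogue generation performance evaluation
Each model was fine-tuned using the Dialogue Generator code. To measure dialogue generation performance, the tokenized responses generated at inference time were first restored to plain text; a BPE tokenizer was then used to measure overlap and distinct between the actual responses and the generated responses.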
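For readers unfamiliar with the two metric families, here is a minimal sketch of how they are commonly computed over tokenized responses. It is not the evaluation script used for the tables in this document: the corpus-level aggregation and the percentage scale for Dist-n are assumptions, and `ngram_precision` is only the overlap building block of BLEU-n (full BLEU also combines several n-gram orders and applies a brevity penalty).

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(responses, n):
    """Dist-n over a corpus: unique n-grams / total n-grams, in percent."""
    counts = Counter(g for tokens in responses for g in ngrams(tokens, n))
    total = sum(counts.values())
    return 100.0 * len(counts) / total if total else 0.0


def ngram_precision(reference, hypothesis, n):
    """Clipped n-gram precision of a generated response against a reference."""
    ref = Counter(ngrams(reference, n))
    hyp = Counter(ngrams(hypothesis, n))
    total = sum(hyp.values())
    matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return matched / total if total else 0.0


if __name__ == "__main__":
    generated = [["a", "b", "a", "b"], ["a", "b", "c"]]
    print(distinct_n(generated, 2))  # 60.0
```

A higher Dist-n means the generated responses reuse fewer n-grams, which is why it is read as a diversity measure alongside the overlap scores.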
Warning
Because the model was pre-trained mostly on short dialogue data, it tends to perform poorly on tasks that require processing long text, such as summarization.
Experimental results
Training | Validation | Test |
---|---|---|
9,458 | 1,182 | 1,183 |
Model | Param | BLEU-3 | BLEU-4 | Dist-1 | Dist-2 |
---|---|---|---|---|---|
KoBART | 124M | 8.73 | 7.12 | 16.85 | 34.89 |
KoChatBART | 139M | 12.97 | 11.23 | 19.64 | 44.53 |
KoT5-ETRI | 324M | 12.10 | 10.14 | 16.97 | 40.09 |
Training | Validation | Test |
---|---|---|
29,093 | 1,616 | 1,616 |
Model | Param | BLEU-3 | BLEU-4 | Dist-1 | Dist-2 |
---|---|---|---|---|---|
KoBART | 124M | 10.04 | 7.24 | 13.76 | 42.09 |
KoChatBART | 139M | 10.11 | 7.26 | 15.12 | 46.08 |
KoT5-ETRI | 324M | 9.45 | 6.66 | 14.50 | 45.46 |