bart-ko-small / README.md
language: ko

Pretrained BART in Korean

This is a BART model pretrained on multiple Korean datasets.

I used multiple datasets so that the model generalizes to both colloquial and written text.

Training was supported by the TPU Research Cloud program.

The script used to pre-train the model is here.

When you use the inference API, you must wrap the sentence with [BOS] and [EOS], as in the example below.

[BOS] ์•ˆ๋…•ํ•˜์„ธ์š”? ๋ฐ˜๊ฐ€์›Œ์š”~~ [EOS]
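The wrapping step can be sketched with a small helper. This is only an illustration of the format shown above; `wrap_for_inference` is a hypothetical name, and the exact model id on the Hub is not stated in this card.

```python
# Minimal sketch of preparing input for this model.
# Assumption: the special tokens are the literal strings "[BOS]" and "[EOS]"
# shown in the example above, separated from the sentence by single spaces.

def wrap_for_inference(sentence: str) -> str:
    """Wrap a raw sentence with the [BOS]/[EOS] markers the card requires."""
    return f"[BOS] {sentence} [EOS]"

print(wrap_for_inference("์•ˆ๋…•ํ•˜์„ธ์š”? ๋ฐ˜๊ฐ€์›Œ์š”~~"))
# → [BOS] ์•ˆ๋…•ํ•˜์„ธ์š”? ๋ฐ˜๊ฐ€์›Œ์š”~~ [EOS]
```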

Used Datasets

๋ชจ๋‘์˜ ๋ง๋ญ‰์น˜

  • ์ผ์ƒ ๋Œ€ํ™” ๋ง๋ญ‰์น˜ 2020
  • ๊ตฌ์–ด ๋ง๋ญ‰์น˜
  • ๋ฌธ์–ด ๋ง๋ญ‰์น˜
  • ์‹ ๋ฌธ ๋ง๋ญ‰์น˜

AIhub

์„ธ์ข… ๋ง๋ญ‰์น˜