metadata
language: ko
tags:
  - bart
license: mit

Korean News Summarization Model

Demo

https://huggingface.co/spaces/gogamza/kobart-summarization

How to use

import torch
from transformers import PreTrainedTokenizerFast
from transformers import BartForConditionalGeneration

# Load the tokenizer and the summarization model from the Hugging Face Hub.
tokenizer = PreTrainedTokenizerFast.from_pretrained('gogamza/kobart-summarization')
model = BartForConditionalGeneration.from_pretrained('gogamza/kobart-summarization')

text = "๊ณผ๊ฑฐ๋ฅผ ๋– ์˜ฌ๋ ค๋ณด์ž. ๋ฐฉ์†ก์„ ๋ณด๋˜ ์šฐ๋ฆฌ์˜ ๋ชจ์Šต์„. ๋…๋ณด์ ์ธ ๋งค์ฒด๋Š” TV์˜€๋‹ค. ์˜จ ๊ฐ€์กฑ์ด ๋‘˜๋Ÿฌ์•‰์•„ TV๋ฅผ ๋ดค๋‹ค. ๊ฐ„ํ˜น ๊ฐ€์กฑ๋“ค๋ผ๋ฆฌ ๋‰ด์Šค์™€ ๋“œ๋ผ๋งˆ, ์˜ˆ๋Šฅ ํ”„๋กœ๊ทธ๋žจ์„ ๋‘˜๋Ÿฌ์‹ธ๊ณ  ๋ฆฌ๋ชจ์ปจ ์Ÿํƒˆ์ „์ด ๋ฒŒ์–ด์ง€๊ธฐ๋„  ํ–ˆ๋‹ค. ๊ฐ์ž ์„ ํ˜ธํ•˜๋Š” ํ”„๋กœ๊ทธ๋žจ์„ โ€˜๋ณธ๋ฐฉโ€™์œผ๋กœ ๋ณด๊ธฐ ์œ„ํ•œ ์‹ธ์›€์ด์—ˆ๋‹ค. TV๊ฐ€ ํ•œ ๋Œ€์ธ์ง€ ๋‘ ๋Œ€์ธ์ง€ ์—ฌ๋ถ€๋„ ๊ทธ๋ž˜์„œ ์ค‘์š”ํ–ˆ๋‹ค. ์ง€๊ธˆ์€ ์–ด๋–ค๊ฐ€. โ€˜์•ˆ๋ฐฉ๊ทน์žฅโ€™์ด๋ผ๋Š” ๋ง์€ ์˜›๋ง์ด ๋๋‹ค. TV๊ฐ€ ์—†๋Š” ์ง‘๋„ ๋งŽ๋‹ค. ๋ฏธ๋””์–ด์˜ ํ˜œ ํƒ์„ ๋ˆ„๋ฆด ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์€ ๋Š˜์–ด๋‚ฌ๋‹ค. ๊ฐ์ž์˜ ๋ฐฉ์—์„œ ๊ฐ์ž์˜ ํœด๋Œ€ํฐ์œผ๋กœ, ๋…ธํŠธ๋ถ์œผ๋กœ, ํƒœ๋ธ”๋ฆฟ์œผ๋กœ ์ฝ˜ํ…์ธ  ๋ฅผ ์ฆ๊ธด๋‹ค."

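# Tokenize the article and wrap the token IDs in the BOS/EOS special tokens.
raw_input_ids = tokenizer.encode(text)
input_ids = [tokenizer.bos_token_id] + raw_input_ids + [tokenizer.eos_token_id]

# Generate summary token IDs for a batch of one, then decode them back to text.
summary_ids = model.generate(torch.tensor([input_ids]))
summary = tokenizer.decode(summary_ids.squeeze().tolist(), skip_special_tokens=True)
print(summary)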
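
Long articles may produce more tokens than the model accepts. The helper below is a minimal sketch of the BOS/EOS wrapping step above with truncation added. The function name `build_input_ids` and the default limit of 1024 are assumptions, not part of this model card; the actual limit should be read from `model.config.max_position_embeddings`. The sketch uses plain integers in place of real token IDs so that it runs without downloading the model.

```python
def build_input_ids(raw_ids, bos_id, eos_id, max_len=1024):
    """Wrap token IDs in BOS/EOS, truncating the body so the total fits max_len.

    Hypothetical helper: max_len=1024 is an assumed limit, not a documented one.
    """
    if max_len < 2:
        raise ValueError("max_len must leave room for BOS and EOS")
    body = list(raw_ids)[: max_len - 2]  # keep room for the two special tokens
    return [bos_id] + body + [eos_id]


# Illustrative IDs only; real values would come from tokenizer.encode(text),
# tokenizer.bos_token_id and tokenizer.eos_token_id.
example = build_input_ids(range(10, 20), bos_id=0, eos_id=1, max_len=6)
print(example)  # [0, 10, 11, 12, 13, 1]
```

With the real tokenizer, the returned list could be passed to the model in the same way as `input_ids` above, that is, as `torch.tensor([ids])`. Truncating from the end is only one possible choice; it assumes the most important content of a news article appears near the beginning.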