metadata
language: ko
tags:
  - bart
datasets:
  - korquad
license: mit

Korean Question Generation Model

Github

https://github.com/Seoneun/KoBART-Question-Generation

Fine-tuning Dataset

KorQuAD 1.0

Demo

https://huggingface.co/Sehong/kobart-QuestionGeneration

How to use

import torch
from transformers import PreTrainedTokenizerFast
from transformers import BartForConditionalGeneration

tokenizer = PreTrainedTokenizerFast.from_pretrained('Sehong/kobart-QuestionGeneration')
model = BartForConditionalGeneration.from_pretrained('Sehong/kobart-QuestionGeneration')

text = "1989년 2월 15일 여의도 농민 폭력 시위를 주도한 혐의(폭력행위등처벌에관한법률위반)으로 지명수배되었다. 1989년 3월 12일 서울지방검찰청 공안부는 임종석의 사전구속영장을 발부받았다. 같은 해 6월 30일 평양축전에 임수경을 대표로 파견하여 국가보안법위반 혐의가 추가되었다. 경찰은 12월 18일~20일 사이 서울 경희대학교에서 임종석이 성명 발표를 추진하고 있다는 첩보를 입수했고, 12월 18일 오전 7시 40분 경 가스총과 전자봉으로 무장한 특공조 및 대공과 직원 12명 등 22명의 사복 경찰을 승용차 8대에 나누어 경희대학교에 투입했다. 1989년 12월 18일 오전 8시 15분 경 서울청량리경찰서는 호위 학생 5명과 함께 경희대학교 학생회관 건물 계단을 내려오는 임종석을 발견, 검거해 구속을 집행했다. 임종석은 청량리경찰서에서 약 1시간 동안 조사를 받은 뒤 오전 9시 50분 경 서울 장안동의 서울지방경찰청 공안분실로 인계되었다. <unused0> 1989년 2월 15일"

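# Tokenize the "content <unused0> answer" string, then wrap it in BOS/EOS tokens.
raw_input_ids = tokenizer.encode(text)
input_ids = [tokenizer.bos_token_id] + raw_input_ids + [tokenizer.eos_token_id]

# Generate the question token IDs and decode them back to text.
question_ids = model.generate(torch.tensor([input_ids]))
print(tokenizer.decode(question_ids.squeeze().tolist(), skip_special_tokens=True))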

# <unused0> is the sep_token; it separates the content (passage) from the answer.
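
The input format described above (passage, then `<unused0>`, then the answer) can be assembled with a small helper instead of being typed by hand. The following is a minimal sketch, not part of the original repository: the names `build_qg_input` and `wrap_with_special_tokens` are hypothetical, and the single-space padding around the separator is an assumption taken from the example string above.

```python
SEP_TOKEN = "<unused0>"  # separator between content and answer, as used in the example above


def build_qg_input(context: str, answer: str, sep_token: str = SEP_TOKEN) -> str:
    """Join a passage and an answer as '<context> <sep_token> <answer>'.

    Hypothetical helper; the spacing mirrors the example string in this README.
    """
    context = context.strip()
    answer = answer.strip()
    if not context or not answer:
        raise ValueError("context and answer must both be non-empty")
    return f"{context} {sep_token} {answer}"


def wrap_with_special_tokens(token_ids, bos_id, eos_id):
    """Return a new list: [bos_id] + token_ids + [eos_id], as done before generate()."""
    return [bos_id] + list(token_ids) + [eos_id]


# First sentence of the example passage, paired with its answer span.
example = build_qg_input(
    "1989년 2월 15일 여의도 농민 폭력 시위를 주도한 혐의(폭력행위등처벌에관한법률위반)으로 지명수배되었다.",
    "1989년 2월 15일",
)
print(example)
```

The resulting string can be passed to `tokenizer.encode`, and the IDs wrapped with `wrap_with_special_tokens(ids, tokenizer.bos_token_id, tokenizer.eos_token_id)` before calling `model.generate`.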