---
language: vi
tags:
- vi
- vietnamese
- gpt2
- text-generation
- lm
- nlp
datasets:
- VN-Literature
widget:
- text: >-
    Hôm ấy, cụ Bá ông quả quyết mở ví tiền để trả cho anh lái chó cái giấy bạc
    một đồng.
inference:
  parameters:
    max_length: 500
    do_sample: True
    temperature: 0.8
---

# GPT-2

A GPT-2 model trained on the works of the Vietnamese writer Vu Trong Phung, so that it generates Vietnamese text in his writing style.

# How to use the model

~~~~
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = GPT2Tokenizer.from_pretrained("Khoa/VN-Literature-Generation")
model = GPT2LMHeadModel.from_pretrained("Khoa/VN-Literature-Generation")
model.to("cpu")

# Encode the prompt as a tensor of token ids
text = "Mùa thu lá vàng rơi"
input_ids = tokenizer.encode(text, return_tensors="pt")
max_length = 300

# Sample with beam search (5 beams), top-k filtering and no repeated bigrams.
# Setting min_length equal to max_length forces each output to max_length tokens.
sample_outputs = model.generate(
    input_ids,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    max_length=max_length,
    min_length=max_length,
    top_k=40,
    num_beams=5,
    early_stopping=True,
    no_repeat_ngram_size=2,
    num_return_sequences=3,
)

# Decode and print each of the returned sequences
for i, sample_output in enumerate(sample_outputs, start=1):
    decoded = tokenizer.decode(sample_output, skip_special_tokens=True)
    print(f">> Generated text {i}\n\n{decoded}")
    print("\n---")
~~~~
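As a shorter alternative, the sketch below wraps the `transformers` text-generation pipeline and reuses the sampling settings from this card's `inference` parameters (`max_length: 500`, `do_sample: True`, `temperature: 0.8`). The names `GENERATION_KWARGS`, `build_kwargs` and `generate` are illustrative helpers and not part of the model or the library. This is one possible way to call the model under those assumptions, not the only one.

```python
# Sampling settings copied from this card's `inference.parameters` block.
GENERATION_KWARGS = {"max_length": 500, "do_sample": True, "temperature": 0.8}


def build_kwargs(**overrides):
    """Return the card's default settings, updated with any per-call overrides."""
    return {**GENERATION_KWARGS, **overrides}


def generate(prompt, model_id="Khoa/VN-Literature-Generation", **overrides):
    """Generate text for `prompt` and return the generated strings.

    `generate` is a hypothetical convenience wrapper. The import is done
    lazily so the settings above can be inspected without loading the model.
    """
    from transformers import pipeline

    # Downloads the model from the Hugging Face Hub on first use.
    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **build_kwargs(**overrides))
    return [out["generated_text"] for out in outputs]
```

Calling `generate("Mùa thu lá vàng rơi", max_length=100)` would then return a list with one sampled continuation, capped at 100 tokens instead of the default 500.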


## Author
Dong Dang Khoa