Khoa committed on
Commit
66405bb
1 Parent(s): ec073e6
Files changed (1)
  1. README.md +59 -2
README.md CHANGED
@@ -1,3 +1,60 @@
  ---
- pipeline_tag: text-generation
- ---
  ---
+ language: vi
+ tags:
+ - vi
+ - vietnamese
+ - gpt2
+ - text-generation
+ - lm
+ - nlp
+ datasets:
+ - VN-Literature
+ widget:
+ - text: >-
+     Hôm ấy, cụ Bá ông quả quyết mở ví tiền để trả cho anh lái chó cái giấy bạc
+     một đồng.
+ inference:
+   parameters:
+     max_length: 500
+     do_sample: true
+     temperature: 0.8
+ ---
+
+ # GPT-2
+
+ This GPT-2 model is pre-trained on the works of Vu Trong Phung and generates Vietnamese text in his writing style.
+
+ # How to use the model
+
+ ~~~~
+ from transformers import GPT2Tokenizer, GPT2LMHeadModel
+
+ tokenizer = GPT2Tokenizer.from_pretrained("Khoa/VN-Literature-Generation")
+ model = GPT2LMHeadModel.from_pretrained("Khoa/VN-Literature-Generation")
+
+ # Encode a Vietnamese prompt and generate three continuations on the CPU
+ text = "Mùa thu lá vàng rơi"
+ input_ids = tokenizer.encode(text, return_tensors='pt')
+ max_length = 300
+ model.to('cpu')
+ sample_outputs = model.generate(input_ids, pad_token_id=tokenizer.eos_token_id,
+                                 do_sample=True,         # sample within beam search
+                                 max_length=max_length,
+                                 min_length=max_length,  # forces exactly max_length tokens
+                                 top_k=40,
+                                 num_beams=5,
+                                 early_stopping=True,
+                                 no_repeat_ngram_size=2,
+                                 num_return_sequences=3)
+ # Decode and print each generated sequence
+ for i, sample_output in enumerate(sample_outputs):
+     print(">> Generated text {}\n\n{}".format(i+1, tokenizer.decode(sample_output.tolist())))
+     print('\n---')
+
+ ~~~~
+
+
+ ## Author
+ `
+ Dong Dang Khoa
+ `