tuanle committed on
Commit
47aec07
1 Parent(s): 6d0452f

Update README.md

  1. README.md +33 -2
README.md CHANGED
@@ -19,6 +19,9 @@ metrics:
  ## Model description
  A fine-tuned Vietnamese GPT2 model that generates Vietnamese news from a given context (category + headline). It is fine-tuned from the Vietnamese Wiki GPT2 pretrained model (https://huggingface.co/danghuy1999/gpt2-viwiki).
 
  ## Purpose
  This model was made only for fun and experimental study. However, it gives impressive results.
  Most of the generated news is fake and contains unconfirmed information. Honestly, I had fun with this project =))
@@ -38,5 +41,33 @@ The dataset is about 30k Vietnamese news dataset from website thanhnien.vn
  - You can choose any category, give it some text for the headline, then generate. There we go.
  - P.S.: I tried to deploy my model on Streamlit's cloud, but it kept breaking because it ran out of memory.
 
- ## GitHub
- - https://github.com/Tuan-Lee-23/Vietnamese-News-Generative-Model
 
  ## Model description
  A fine-tuned Vietnamese GPT2 model that generates Vietnamese news from a given context (category + headline). It is fine-tuned from the Vietnamese Wiki GPT2 pretrained model (https://huggingface.co/danghuy1999/gpt2-viwiki).
 
+ ## GitHub
+ - https://github.com/Tuan-Lee-23/Vietnamese-News-Generative-Model
+
  ## Purpose
  This model was made only for fun and experimental study. However, it gives impressive results.
  Most of the generated news is fake and contains unconfirmed information. Honestly, I had fun with this project =))
 
  - You can choose any category, give it some text for the headline, then generate. There we go.
  - P.S.: I tried to deploy my model on Streamlit's cloud, but it kept breaking because it ran out of memory.
 
+
+ ## Usage (Hugging Face)
+ ```
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Example inputs and sampling settings. These values are placeholders
+ # chosen for illustration, not the author's settings.
+ category = "Thể thao"
+ headline = "Đội tuyển Việt Nam thắng trận mở màn"
+ max_len, min_len = 256, 60
+ top_k, top_p = 50, 0.95
+ # With beam search, num_return_sequences must not exceed num_beams.
+ num_beams, num_return_sequences = 5, 3
+
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+ # Prompt format: start token, category, headline token, headline.
+ text = f"<|startoftext|> {category} <|headline|> {headline}"
+
+ tokenizer = AutoTokenizer.from_pretrained("tuanle/VN-News-GPT2")
+ model = AutoModelForCausalLM.from_pretrained("tuanle/VN-News-GPT2").to(device)
+
+ input_ids = tokenizer.encode(text, return_tensors='pt').to(device)
+ sample_outputs = model.generate(
+     input_ids,
+     do_sample=True,
+     max_length=max_len,
+     min_length=min_len,
+     # temperature=0.8,
+     top_k=top_k,
+     top_p=top_p,
+     num_beams=num_beams,
+     early_stopping=True,
+     no_repeat_ngram_size=2,
+     num_return_sequences=num_return_sequences,
+ )
+
+ # Decode and print each generated sequence.
+ for i, sample_output in enumerate(sample_outputs):
+     temp = tokenizer.decode(sample_output.tolist())
+     print(f">> Generated text {i+1}\n\n{temp}")
+     print('\n---')
+ ```
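
The usage snippet builds its prompt from a category and a headline with the `<|startoftext|>` and `<|headline|>` markers. As a rough sketch, that step could be wrapped in two small helpers. The names `build_prompt` and `split_generated` are hypothetical and not part of the repository, and the category and headline below are made-up placeholders that may not match the categories the model was trained on.

```python
# Hypothetical helpers for illustration; they are not part of the model repository.
START_TOKEN = "<|startoftext|>"
HEADLINE_TOKEN = "<|headline|>"


def build_prompt(category: str, headline: str) -> str:
    """Build a prompt in the same layout as the usage snippet."""
    return f"{START_TOKEN} {category.strip()} {HEADLINE_TOKEN} {headline.strip()}"


def split_generated(decoded: str, prompt: str) -> str:
    """Return only the continuation when the decoded text starts with the prompt.

    Assumption: the tokenizer may not reproduce the prompt character for
    character, in which case the decoded text is returned unchanged (trimmed).
    """
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    return decoded.strip()


if __name__ == "__main__":
    # Placeholder inputs, chosen only to show the call.
    prompt = build_prompt("Thể thao", "Đội tuyển Việt Nam thắng trận mở màn")
    print(prompt)
```

The string returned by `build_prompt` can be passed to `tokenizer.encode` in place of `text`, and `split_generated` can be applied to each decoded sample if only the article body is wanted.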