tuanle/VN-News-GPT2

tuanle committed on Feb 26, 2022

Commit 47aec07 • 1 Parent(s): 6d0452f

Update README.md

Files changed (1)

README.md +33 -2


@@ -19,6 +19,9 @@ metrics:
 ## Model description
 A fine-tuned Vietnamese GPT2 model that generates Vietnamese news from a given context (category + headline), based on the Vietnamese Wiki GPT2 pretrained model (https://huggingface.co/danghuy1999/gpt2-viwiki)
+## Github
+- https://github.com/Tuan-Lee-23/Vietnamese-News-Generative-Model
 ## Purpose
 This model was made only for fun and experimental study. However, it gives impressive results.
 Most of the generated news is fake, with unconfirmed information. Honestly, I had fun with this project =))
@@ -38,5 +41,33 @@ The dataset contains about 30k Vietnamese news articles from the website thanhnien.vn
 - You can choose any category and give it some text for the headline, then generate. There we go
 - P/s: I've already tried to deploy my model on Streamlit's cloud, but it kept crashing because it ran out of memory
-## Github
-- https://github.com/Tuan-Lee-23/Vietnamese-News-Generative-Model
+## Usage (Hugging Face)
+```
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+
+# Placeholder inputs and sampling settings: example values only, adjust as needed
+category = "Thể thao"
+headline = "Đội tuyển Việt Nam chuẩn bị cho trận đấu sắp tới"
+max_len, min_len = 256, 60
+top_k, top_p = 50, 0.95
+num_beams = 5
+num_return_sequences = 3  # must not exceed num_beams
+
+# Prompt format: <|startoftext|> category <|headline|> headline
+text = f"<|startoftext|> {category} <|headline|> {headline}"
+
+tokenizer = AutoTokenizer.from_pretrained("tuanle/VN-News-GPT2")
+model = AutoModelForCausalLM.from_pretrained("tuanle/VN-News-GPT2").to(device)
+
+input_ids = tokenizer.encode(text, return_tensors='pt').to(device)
+sample_outputs = model.generate(input_ids,
+                                do_sample=True,
+                                max_length=max_len,
+                                min_length=min_len,
+                                # temperature=0.8,
+                                top_k=top_k,
+                                top_p=top_p,
+                                num_beams=num_beams,
+                                early_stopping=True,
+                                no_repeat_ngram_size=2,
+                                num_return_sequences=num_return_sequences)
+
+# Decode and print each generated sequence
+for i, sample_output in enumerate(sample_outputs):
+    temp = tokenizer.decode(sample_output.tolist())
+    print(f">> Generated text {i+1}\n\n{temp}")
+    print('\n---')
+```
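
The usage snippet above conditions the model on a prompt of the form `<|startoftext|> {category} <|headline|> {headline}`. As a minimal sketch, that formatting step could be wrapped in a small helper so the two fields are always cleaned and combined the same way. The name `build_prompt` and the empty-input check are assumptions for illustration and are not part of the repository; the example category and headline are placeholders.

```python
def build_prompt(category: str, headline: str) -> str:
    """Combine a category and a headline into the prompt layout used in the README snippet.

    Hypothetical helper: only the "<|startoftext|> ... <|headline|> ..." layout
    comes from the README; the whitespace stripping and validation are assumptions.
    """
    category = category.strip()
    headline = headline.strip()
    if not category or not headline:
        raise ValueError("category and headline must both be non-empty")
    return f"<|startoftext|> {category} <|headline|> {headline}"


if __name__ == "__main__":
    # Placeholder inputs, chosen only to show the resulting prompt string
    print(build_prompt("Thể thao", "Đội tuyển Việt Nam chuẩn bị cho trận đấu sắp tới"))
```

The returned string can be passed to `tokenizer.encode(...)` in place of the inline f-string.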