benchang1110 committed
Commit f09436a
1 Parent(s): 543ede3

Update README.md

Files changed (1): README.md +36 -1
README.md CHANGED
@@ -14,4 +14,39 @@ widget:
 
 # Model Card for Model ID
 
-This is a continue-pretrained version of [Tinyllama](TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) tailored for traditional Chinese. The continue-pretraining dataset contains roughly 2B tokens.
+This is a continually pretrained version of [TinyLlama](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) tailored for Traditional Chinese. The continued-pretraining dataset contains roughly 2B tokens.
+
+# Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+
+def generate_response(prompt):
+    '''
+    Simple generation test for the model.
+    '''
+    # tokenize the input prompt
+    tokenized_input = tokenizer.encode_plus(prompt, return_tensors='pt').to(device)
+
+    # generate the response with greedy decoding and a repetition penalty
+    outputs = model.generate(
+        input_ids=tokenized_input['input_ids'],
+        attention_mask=tokenized_input['attention_mask'],
+        pad_token_id=tokenizer.pad_token_id,
+        do_sample=False,
+        repetition_penalty=1.3,
+        max_length=500
+    )
+
+    # decode the generated ids back into text
+    return tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+if __name__ == '__main__':
+    device = 'cuda' if torch.cuda.is_available() else 'cpu'
+    model = AutoModelForCausalLM.from_pretrained("benchang1110/Taiwan-tinyllama-v1.0-base", device_map=device, torch_dtype=torch.bfloat16)
+    tokenizer = AutoTokenizer.from_pretrained("benchang1110/Taiwan-tinyllama-v1.0-base")
+    while True:
+        text = input("Input a prompt: ")
+        print('System:', generate_response(text))
+```
+Using bfloat16, inference requires only around 3 GB of VRAM!
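
To verify that figure on your own GPU, you can read PyTorch's peak-allocation counter after running a generation. Below is a minimal sketch, assuming the `model`, `device`, and `generate_response` from the snippet above are already set up on a CUDA device; the prompt string is only a placeholder. `torch.cuda.reset_peak_memory_stats` and `torch.cuda.max_memory_allocated` are standard PyTorch APIs.

```python
import torch

# Assumes `model`, `tokenizer`, `device`, and `generate_response`
# from the usage snippet above, with device == 'cuda'.
torch.cuda.reset_peak_memory_stats()  # clear any earlier peak readings
_ = generate_response("你好，請介紹台灣。")  # placeholder prompt
peak_gib = torch.cuda.max_memory_allocated() / 1024**3
print(f"Peak VRAM during generation: {peak_gib:.2f} GiB")
```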