andreaskoepf committed
Commit ced264d
Parent(s): 3d4a919

add usage example written by Jordi

Files changed (1): README.md (+21 -0)
README.md CHANGED
@@ -17,11 +17,32 @@ widget:
---

# llama2-13b-orca-8k-3319

+ ## Model Description
+
This model is a fine-tuning of Meta's Llama2 13B model with 8K context size on a long-conversation variant of the Dolphin dataset ([orca-chat](https://huggingface.co/datasets/shahules786/orca-chat)).

Note: **At least Huggingface Transformers [4.31.0](https://pypi.org/project/transformers/4.31.0/) is required to load this model!**
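A quick pre-flight check can fail fast before the multi-gigabyte weight download starts. This is a minimal sketch, not part of the commit; it assumes only that `packaging` is available (it is a dependency of `transformers`):

```python
# Illustrative version guard: verify the installed transformers release is
# new enough to load this model before downloading its weights.
from packaging import version
import transformers

if version.parse(transformers.__version__) < version.parse("4.31.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} found, but >= 4.31.0 "
        "is required to load llama2-13b-orca-8k-3319"
    )
```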
+ ## Usage
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # load the slow (SentencePiece) tokenizer and the model in fp16,
+ # letting accelerate distribute layers across available devices
+ tokenizer = AutoTokenizer.from_pretrained("OpenAssistant/llama2-13b-orca-8k-3319", use_fast=False)
+ model = AutoModelForCausalLM.from_pretrained("OpenAssistant/llama2-13b-orca-8k-3319", torch_dtype=torch.float16, low_cpu_mem_usage=True, device_map="auto")
+
+ # assemble the prompt in the model's conversation format
+ system_message = "You are an AI assistant. Provide a detailed answer so user don't need to search outside to understand the answer."
+ user_prompt = "Write me a poem please"
+ prompt = f"""<|system|>{system_message}</s><|prompter|>{user_prompt}</s><|assistant|>"""
+
+ # tokenize, sample a completion of up to 256 tokens, and print it
+ inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
+ output = model.generate(**inputs, do_sample=True, top_p=0.95, top_k=0, max_new_tokens=256)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
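Since the model card advertises an 8K context, a caller may want to confirm that the tokenized prompt plus the requested completion still fits the window before generating. A small sketch continuing from the variables above, assuming (not confirmed by this commit) that the extended context is exposed as `model.config.max_position_embeddings`:

```python
# Illustrative length guard for long prompts; reuses `inputs` and `model`
# from the usage example and the max_new_tokens value used there.
prompt_tokens = inputs["input_ids"].shape[-1]
context_window = model.config.max_position_embeddings  # assumed to be 8192
if prompt_tokens + 256 > context_window:
    raise ValueError(
        f"{prompt_tokens} prompt tokens + 256 new tokens exceed "
        f"the {context_window}-token context window"
    )
```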
+
+ ## Model Details
+
- base model: [meta-llama/Llama-2-13b](https://huggingface.co/meta-llama/Llama-2-13b)
- License: [Llama 2 Community License Agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
- sampling report: TBD