Tags: Text Generation, Transformers, Safetensors, French, English, llama, legal, code, text-generation-inference, art, conversational, Inference Endpoints
manu committed
Commit
9cef06a
1 Parent(s): 104db24

Update README.md

Files changed (1)
  1. README.md +17 -5
README.md CHANGED
@@ -27,7 +27,17 @@ https://arxiv.org/abs/2402.00786
 For best performance, it should be used with a temperature of above 0.4, and with the exact template described below:
 
 ```python
-CHAT = """<|im_start|>user
+chat = [
+    {"role": "user", "content": "Que puis-je faire à Marseille en hiver?"},
+]
+
+chat_input = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+```
+
+corresponding to:
+
+```python
+chat_input = """<|im_start|>user
 {USER QUERY}<|im_end|>
 <|im_start|>assistant\n"""
 ```
@@ -68,11 +78,13 @@ model_name = "croissantllm/CroissantLLMChat-v0.1"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")
 
-CHAT = """<|im_start|>user
-Que puis-je faire à Marseille?<|im_end|>
-<|im_start|>assistant\n"""
+chat = [
+    {"role": "user", "content": "Que puis-je faire à Marseille en hiver?"},
+]
+
+chat_input = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
 
-inputs = tokenizer(CHAT, return_tensors="pt", add_special_tokens=True).to(model.device)
+inputs = tokenizer(chat_input, return_tensors="pt", add_special_tokens=True).to(model.device)
 tokens = model.generate(**inputs, max_new_tokens=150, do_sample=True, top_p=0.95, top_k=60, temperature=0.5)
 print(tokenizer.decode(tokens[0]))
 ```
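Taken together, the commit replaces a hand-written prompt constant with the tokenizer's stored chat template. A minimal end-to-end sketch assembled from the snippets in the diff above (the model name, prompt, and sampling parameters are the ones shown there; the imports are standard transformers usage):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "croissantllm/CroissantLLMChat-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# Build the prompt from the tokenizer's chat template instead of
# hard-coding the <|im_start|>/<|im_end|> markers by hand.
chat = [
    {"role": "user", "content": "Que puis-je faire à Marseille en hiver?"},
]
chat_input = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# Tokenize and sample with the settings recommended in the README
# (temperature above 0.4).
inputs = tokenizer(chat_input, return_tensors="pt", add_special_tokens=True).to(model.device)
tokens = model.generate(**inputs, max_new_tokens=150, do_sample=True, top_p=0.95, top_k=60, temperature=0.5)
print(tokenizer.decode(tokens[0]))
```

Using apply_chat_template here means the prompt string is read from the template shipped with the tokenizer, so it stays in sync with the <|im_start|>user ... <|im_end|> format the model expects rather than relying on a manually copied constant.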