duyntnet committed on
Commit d42e1b4
1 Parent(s): 7773878

Upload README.md

Files changed (1)
  1. README.md +62 -0
README.md ADDED
---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- openchat-3.6-8b-20240522
---
Quantizations of https://huggingface.co/openchat/openchat-3.6-8b-20240522
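
A minimal sketch of running one of these GGUF quantizations with the llama-cpp-python bindings; the quant filename below is a placeholder for whichever file you download from this repo, and the prompt format is the one described under "Conversation templates" below:

```python
# Minimal sketch, assuming llama-cpp-python is installed
# (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="openchat-3.6-8b-20240522-Q4_K_M.gguf",  # placeholder quant filename
    n_ctx=4096,  # context window
)

# Prompt in the "GPT4 Correct" format; stop on <|end_of_turn|>,
# as the original readme advises.
out = llm(
    "GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:",
    max_tokens=128,
    stop=["<|end_of_turn|>"],
)
print(out["choices"][0]["text"])
```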

# From original readme

### Conversation templates

💡 **Default Mode**: Best for coding, chat and general tasks

```
GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:
```

⚠️ **Notice:** Remember to set `<|end_of_turn|>` as the end-of-generation token.
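
With Transformers, one way to do this (a sketch, assuming `tokenizer`, `model`, and `input_ids` are set up as in the inference example below) is to look up the token id and pass it to `generate`:

```python
# Stop generation at <|end_of_turn|>; assumes tokenizer, model, and input_ids
# from the "Inference using Transformers" example below.
eot_id = tokenizer.convert_tokens_to_ids("<|end_of_turn|>")
outputs = model.generate(input_ids, max_new_tokens=1024, eos_token_id=eot_id)
```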

The default template is also available as the integrated `tokenizer.chat_template`, which can be used instead of manually specifying the template:

```python
# Assumes `tokenizer` was loaded with AutoTokenizer.from_pretrained
# (see the inference example below).
messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi"},
    {"role": "user", "content": "How are you today?"}
]
# Token ids for the conversation, ending with the assistant generation prompt.
tokens = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
```
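
As a quick sanity check that the integrated template renders the same string as the manual format above, `apply_chat_template` can also return plain text:

```python
# Render the template to a string instead of token ids.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(text)  # should end with "GPT4 Correct Assistant:"
```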

### Inference using Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "openchat/openchat-3.6-8b-20240522"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "user", "content": "Explain how large language models work in detail."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(input_ids,
                         do_sample=True,
                         temperature=0.5,
                         max_new_tokens=1024)
# Keep only the newly generated tokens (drop the prompt).
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
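
As written, `generate` stops at the model's default end-of-sequence token; if generations run past the turn boundary, pass the `<|end_of_turn|>` id via `eos_token_id` as shown in the notice above.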