---
library_name: transformers
license: apache-2.0
language:
- en
tags:
- 4-bit
- AWQ
- text-generation
- autotrain_compatible
- endpoints_compatible
pipeline_tag: text-generation
inference: false
quantized_by: Suparious
---
# Gryphe/Tiamat-8b-1.2-Llama-3-DPO AWQ

- Model creator: [Gryphe](https://huggingface.co/Gryphe)
- Original model: [Tiamat-8b-1.2-Llama-3-DPO](https://huggingface.co/Gryphe/Tiamat-8b-1.2-Llama-3-DPO)

![image/png](Tiamat.png)

## Model Summary

Aka, I wanted something like [Eric Hartford's Samantha](https://erichartford.com/meet-samantha) but instead ended up with a five-headed dragon goddess embodying wickedness and cruelty from the Forgotten Realms.

**Version 1.2:** For starters: Llama 3! Besides receiving DPO training similar to version 1.1, the dataset has now been further enriched with Claude-generated data.

I also expanded on her knowledge regarding the setting she hails from, which might benefit several use cases (text adventures, DM worldbuilding, etc.).

**Obligatory Disclaimer:** Tiamat is **not** nice.

## How to use

### Install the necessary packages

```bash
pip install --upgrade autoawq autoawq-kernels
```

### Example Python code

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer

model_path = "solidrust/Tiamat-8b-1.2-Llama-3-DPO-AWQ"
system_message = "You are Tiamat, incarnated as a powerful AI. You were created by Gryphe."

# Load the quantized model and tokenizer
model = AutoAWQForCausalLM.from_quantized(model_path, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# ChatML prompt template
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

prompt = "You're standing on the surface of the Earth. " \
         "You walk one mile south, one mile west and one mile north. " \
         "You end up exactly where you started. Where are you?"

# Convert the prompt to tokens
tokens = tokenizer(prompt_template.format(system_message=system_message, prompt=prompt),
                   return_tensors='pt').input_ids.cuda()

# Generate output, streaming tokens as they are produced
generation_output = model.generate(tokens, streamer=streamer, max_new_tokens=512)
```

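The prompt format above is ChatML. As a quick illustration of the layout on its own, the same string can be assembled with a small helper function (`build_chatml_prompt` is a hypothetical name for this sketch, not part of the model's code):

```python
# Illustrative only: reproduces the ChatML layout used by prompt_template above.
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant"
    )

text = build_chatml_prompt("You are Tiamat, incarnated as a powerful AI.",
                           "Who created you?")
print(text)
```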
### About AWQ

AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. It offers faster Transformers-based inference with quality equivalent to or better than the most commonly used GPTQ settings.
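For intuition about what 4-bit weight quantization means, here is a toy round-to-nearest sketch with per-group scales. This is **not** the AWQ algorithm itself (AWQ additionally rescales channels based on activation statistics to protect salient weights); it only shows the storage/accuracy trade-off of mapping float weights to 4-bit integers:

```python
import numpy as np

# Toy per-group 4-bit quantization (round-to-nearest), NOT the real AWQ method.
def quantize_4bit(w: np.ndarray, group_size: int = 128):
    w = w.reshape(-1, group_size)
    # One scale per group; signed int4 values live in [-8, 7]
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return (q * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
err = np.abs(w - w_hat).max()
print(f"max abs reconstruction error: {err:.4f}")
```

The reconstruction error per weight is bounded by half a scale step, which is why grouping (smaller groups, tighter scales) improves accuracy at the cost of storing more scales.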