mobicham committed on
Commit
70e587f
1 Parent(s): 67db312

Update README.md

Files changed (1): README.md (+92 -0)
## Llama-2-70b-chat-hf-2bit_g16_s128-HQQ
This is a version of the Llama-2-70B-chat-hf model quantized to 2-bit via Half-Quadratic Quantization (HQQ): https://mobiusml.github.io/hqq_blog/
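To give a feel for what 2-bit, group-wise quantization means, here is a minimal sketch in plain Python. It assumes that `g16` in the model name refers to a group size of 16, and it uses simple min-max rounding per group. It does not reproduce the half-quadratic optimization that HQQ applies, and all function names are hypothetical.

```python
def quantize_group(weights, nbits=2):
    """Min-max (round-to-nearest) quantization of one group of floats."""
    levels = 2 ** nbits - 1  # 2 bits -> integer codes 0..3
    w_min, w_max = min(weights), max(weights)
    # Fall back to 1.0 so a constant group does not divide by zero.
    scale = (w_max - w_min) / levels or 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate float weights."""
    return [c * scale + zero for c in codes]


def quantize(weights, nbits=2, group_size=16):
    """Quantize a flat list of weights, one (codes, scale, zero) triple per group."""
    groups = [weights[i:i + group_size] for i in range(0, len(weights), group_size)]
    return [quantize_group(g, nbits) for g in groups]


# Toy data: 32 weights -> 2 groups of 16 (group size assumed from the model name).
weights = [(-1) ** i * (i % 7) / 10 for i in range(32)]
packed = quantize(weights)
restored = [w for codes, scale, zero in packed
            for w in dequantize_group(codes, scale, zero)]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each group stores 16 two-bit codes plus one scale and one offset, so the reconstruction error within a group is bounded by half of that group's scale.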

### Basic Usage
To run the model, install the HQQ library from https://github.com/mobiusml/hqq and use it as follows:
``` Python
from hqq.models.llama_hf import LlamaHQQ
...
model = LlamaHQQ.from_quantized(model_id)
```
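Before loading a model of this size, it can help to confirm that the required packages are importable. The following is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the module names `hqq` and `transformers` are taken from the imports used in this README.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report anything that still needs to be installed before loading the model.
missing = missing_packages(["hqq", "transformers"])
if missing:
    print("Missing packages:", ", ".join(missing))
```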

### Basic Chat Example
``` Python
import transformers
from hqq.models.llama_hf import LlamaHQQ
from sys import stdout
from threading import Thread

model_id = 'mobiuslabsgmbh/Llama-2-70b-chat-hf-2bit_g16_s128-HQQ'
#Load the tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
#Load the model
model = LlamaHQQ.from_quantized(model_id)

##########################################################################################################
def print_flush(data):
    #Rewrite the current console line with the text generated so far
    stdout.write("\r" + data)
    stdout.flush()

#Adapted from https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat/blob/main/app.py
def process_conversation(chat):
    system_prompt = chat['system_prompt']
    chat_history = chat['chat_history']
    message = chat['message']

    #Build the list of role/content messages expected by the chat template
    conversation = []
    if system_prompt:
        conversation.append({"role": "system", "content": system_prompt})
    for user, assistant in chat_history:
        conversation.extend([{"role": "user", "content": user}, {"role": "assistant", "content": assistant}])
    conversation.append({"role": "user", "content": message})

    return tokenizer.apply_chat_template(conversation, return_tensors="pt").to('cuda')

def chat_processor(chat, max_new_tokens=100, do_sample=True):
    tokenizer.use_default_system_prompt = False
    streamer = transformers.TextIteratorStreamer(tokenizer, timeout=10.0, skip_prompt=True, skip_special_tokens=True)

    generate_params = dict(
        {"input_ids": process_conversation(chat)},
        streamer=streamer,
        max_new_tokens=max_new_tokens,
        do_sample=do_sample,
        top_p=0.90,
        top_k=50,
        temperature=0.6,
        num_beams=1,
        repetition_penalty=1.2,
    )

    #Run generation in a background thread so the streamer can be consumed here
    t = Thread(target=model.generate, kwargs=generate_params)
    t.start()

    outputs = []
    for text in streamer:
        outputs.append(text)
        print_flush("".join(outputs))

    return outputs

###################################################################################################

outputs = chat_processor({'system_prompt':"You are a helpful assistant.",
                          'chat_history':[],
                          'message':"How can I build a car?"
                          },
                         max_new_tokens=1000, do_sample=False)
```

<b>Output</b>:
<p>
Building a car is a complex process that involves designing, prototyping, testing, and manufacturing. Here are some general steps you can follow to build a car:

1. Design the car: Determine the type of car you want to build, including the size, shape, and features. Create a detailed set of blueprints or computer-aided design (CAD) drawings to guide your building process.
2. Source materials: Purchase or gather all the necessary materials, such as steel, aluminum, rubber, plastics, and any other components required for the car's body, frame, and engine.
3. Build the frame: Construct the frame, which is the foundation of the car. This includes creating the chassis, suspension, and steering systems.
4. Install the engine: Choose an appropriate engine and install it in the frame. Connect the engine to the transmission, exhaust system, and cooling system.
5. Add the body: Attach the body panels to the frame, including the hood, doors, trunk lid, and roof. Ensure proper alignment and fitment.
6. Install the electrical system: Connect the battery, starter, alternator, and wiring harness to the engine and other components. Install headlights, taillights, and other electrical accessories.
7. Add the brakes: Install the brake system, including the brake pads, rotors, calipers, and master cylinder. Connect the brake lines and bleed the system to remove air bubbles.
8. Install the interior: Fit the seats, dashboard, carpeting, and other interior components. Install the steering column, pedals, and shifter.
9. Test and inspect: Check the car's systems, including the brakes, suspension, and engine performance. Make sure everything is functioning properly and safely.
10. Register and insure: Obtain registration and insurance for your newly built car. Comply with local regulations and laws regarding vehicle ownership and operation.

Please note that this is a high-level overview of the process, and building a car can be a complex and time-consuming task. It requires specialized knowledge, skills, and tools, as well as a clean and organized workspace. Additionally, safety precautions should always be taken when working on vehicles, as they can be dangerous if mishandled.

If you are not experienced in automotive construction, it may be advisable to seek guidance from professionals or take a course in automotive mechanics before attempting to build a car.

----------------------------------------------------------------------------------------------------------------------------------
</p>
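The control flow of the chat example (build a message list, run generation in a background thread, read chunks from an iterator) can be tried without loading the 70B model. The sketch below is a dry run under stated assumptions: `QueueStreamer` and `fake_generate` are hypothetical stand-ins for `transformers.TextIteratorStreamer` and `model.generate`, and no tokenizer or chat template is involved.

```python
import queue
import threading


def build_conversation(chat):
    """Same message layout as process_conversation, minus tokenization."""
    conversation = []
    if chat["system_prompt"]:
        conversation.append({"role": "system", "content": chat["system_prompt"]})
    for user, assistant in chat["chat_history"]:
        conversation.append({"role": "user", "content": user})
        conversation.append({"role": "assistant", "content": assistant})
    conversation.append({"role": "user", "content": chat["message"]})
    return conversation


class QueueStreamer:
    """Hypothetical, simplified stand-in for transformers.TextIteratorStreamer."""

    _END = object()  # sentinel that marks the end of generation

    def __init__(self, timeout=None):
        self._queue = queue.Queue()
        self._timeout = timeout

    def put(self, text):
        self._queue.put(text)

    def end(self):
        self._queue.put(self._END)

    def __iter__(self):
        return self

    def __next__(self):
        # Blocks until the producer thread supplies the next chunk.
        item = self._queue.get(timeout=self._timeout)
        if item is self._END:
            raise StopIteration
        return item


def fake_generate(conversation, streamer):
    """Hypothetical stand-in for model.generate: echoes the last message word by word."""
    for word in conversation[-1]["content"].split():
        streamer.put(word + " ")
    streamer.end()


def dry_run(chat):
    conversation = build_conversation(chat)
    streamer = QueueStreamer(timeout=10.0)
    worker = threading.Thread(
        target=fake_generate,
        kwargs={"conversation": conversation, "streamer": streamer},
    )
    worker.start()
    outputs = [text for text in streamer]  # consume chunks as they arrive
    worker.join()
    return conversation, outputs


conversation, outputs = dry_run({
    "system_prompt": "You are a helpful assistant.",
    "chat_history": [("Hi", "Hello!")],
    "message": "How can I build a car?",
})
```

The producer thread pushes text into a queue and the main thread iterates over it, which is why the real example can print partial output while generation is still running.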

*Limitations*: <br>
-Only supports single GPU runtime.<br>
-Not compatible with HuggingFace's PEFT.<br>
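
Given the single-GPU limitation, one option on a multi-GPU machine is to expose only one device to the process through the standard `CUDA_VISIBLE_DEVICES` environment variable. This is a minimal sketch: the helper name `pin_to_single_gpu` is hypothetical, and whether HQQ needs this step on a given setup is an assumption. The variable must be set before any CUDA-using library is imported.

```python
import os


def pin_to_single_gpu(index=0):
    """Expose only one GPU to this process; call before importing torch or hqq."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


visible = pin_to_single_gpu(0)
# Import torch, transformers and hqq only after this point.
```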