aisensiy committed on
Commit 36b848d
1 Parent(s): 79bff4e

Update README.md

README.md CHANGED
license: mit
---

## How to convert

First, clone [llama.cpp](https://github.com/ggerganov/llama.cpp) and build it.

Then follow the instructions to generate GGUF files.

```
# convert Qwen HF models to GGUF fp16 format
python convert-hf-to-gguf.py --outfile qwen7b-chat-f16.gguf --outtype f16 Qwen-7B-Chat

# quantize the model to 4 bits (using the q4_0 method)
./quantize qwen7b-chat-f16.gguf qwen7b-chat-q4_0.gguf q4_0

# chat with the quantized Qwen model
./main -m qwen7b-chat-q4_0.gguf -n 512 --color -i -cml -f prompts/chat-with-qwen.txt
```
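A quick way to confirm the conversion produced a valid output is to inspect the file header: per the GGUF format, a file begins with the 4-byte magic `GGUF` followed by a little-endian uint32 version. A minimal sketch (the stub file written here stands in for a real output such as `qwen7b-chat-q4_0.gguf`):

```python
import struct

def read_gguf_header(path):
    """Return (magic, version) from the first 8 bytes of a GGUF file."""
    with open(path, "rb") as f:
        magic = f.read(4)                          # should be b"GGUF"
        version = struct.unpack("<I", f.read(4))[0]  # little-endian uint32
    return magic, version

# demo with a stub header; point this at your converted .gguf file instead
with open("demo.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<I", 3))

magic, version = read_gguf_header("demo.gguf")
print(magic, version)  # b'GGUF' 3
```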

## Files are split and require joining