MaziyarPanahi committed on
Commit
fafe5de
1 Parent(s): 336a3f1

Create README.md (#2)

- Create README.md (a57d84d49bb750b51b4c4651a31d95ec66db02b2)

Files changed (1)
  1. README.md +119 -0
README.md ADDED
@@ -0,0 +1,119 @@
1
+ ---
2
+ pipeline_tag: text-generation
3
+ tags:
4
+ - qwen
5
+ - qwen-2
6
+ - quantized
7
+ - 2-bit
8
+ - 3-bit
9
+ - 4-bit
10
+ - 5-bit
11
+ - 6-bit
12
+ - 8-bit
13
+ - 16-bit
14
+ - GGUF
15
+ inference: false
16
+ model_creator: MaziyarPanahi
17
+ model_name: Qwen2-72B-Instruct-v0.1-GGUF
18
+ quantized_by: MaziyarPanahi
19
+ license: other
20
+ license_name: tongyi-qianwen
21
+ license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
22
+ ---
23
+
24
+
25
+ # MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF
26
+
27
+ The GGUF and quantized models here are based on the [MaziyarPanahi/Qwen2-72B-Instruct-v0.1](https://huggingface.co/MaziyarPanahi/Qwen2-72B-Instruct-v0.1) model.
28
+
29
+ ## How to download
30
+ You can download only the quants you need instead of cloning the entire repository as follows:
31
+
32
+ ```sh
33
+ huggingface-cli download MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF --local-dir . --include '*Q2_K*gguf'
34
+ ```
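
The same selective download can be sketched in Python. This is a minimal sketch, assuming the `huggingface_hub` package is installed and that its `allow_patterns` argument follows the same shell-style globbing as `--include`. The file names in `example_files` are hypothetical and only illustrate what the pattern would match.

```python
from fnmatch import fnmatchcase

REPO_ID = "MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF"


def select_quant_files(filenames, pattern="*Q2_K*gguf"):
    """Return the file names that the glob pattern would select."""
    return [name for name in filenames if fnmatchcase(name, pattern)]


def download_quant(pattern="*Q2_K*gguf", local_dir="."):
    """Download only the matching files (needs huggingface_hub and network access)."""
    # Imported lazily so the pattern helper above works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[pattern],
        local_dir=local_dir,
    )


# Hypothetical file names, for illustration only.
example_files = [
    "Qwen2-72B-Instruct-v0.1.Q2_K.gguf",
    "Qwen2-72B-Instruct-v0.1.Q4_K_M.gguf",
    "README.md",
]
print(select_quant_files(example_files))
```

Calling `download_quant()` would fetch the matching files into the current directory. Checking the pattern against a file list first is a cheap way to avoid pulling a multi-gigabyte quant by mistake.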
35
+
36
+ ## Load GGUF models
37
+
38
+ You `MUST` follow the `ChatML` prompt template used by this model:
39
+
40
+
41
+ ```sh
42
+ ./llama.cpp/main -m Qwen2-72B-Instruct-v0.1.Q2_K.gguf -p "<|im_start|>user\nJust say 1, 2, 3 hi and NOTHING else\n<|im_end|>\n<|im_start|>assistant\n" -n 1024
43
+ ```
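
When the command is launched from a script, building the argument list in Python avoids shell quoting and escape handling for the `\n` sequences. The following is a minimal sketch that uses only the flags shown in the command above (`-m`, `-p`, `-n`). The helper names and the `model.Q2_K.gguf` path are placeholders, not part of llama.cpp.

```python
import subprocess


def build_prompt(user_message):
    """Wrap a single user message in the same layout as the command above."""
    return f"<|im_start|>user\n{user_message}\n<|im_end|>\n<|im_start|>assistant\n"


def build_command(model_path, user_message, n_predict=1024, binary="./llama.cpp/main"):
    """Return the argument list for subprocess.run; no shell is involved."""
    return [
        binary,
        "-m", model_path,
        "-p", build_prompt(user_message),  # contains real newlines, not "\n" text
        "-n", str(n_predict),
    ]


def run(model_path, user_message):
    """Run llama.cpp and raise if it exits with a non-zero status."""
    return subprocess.run(build_command(model_path, user_message), check=True)


# "model.Q2_K.gguf" is a placeholder path for illustration.
command = build_command("model.Q2_K.gguf", "Just say 1, 2, 3 hi and NOTHING else")
print(command)
```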
44
+
45
+
46
+
47
+
48
+ ## Original README
49
+
50
+ ---
51
+
52
+ # MaziyarPanahi/Qwen2-72B-Instruct-v0.1
53
+
54
+ This is a fine-tuned version of the `Qwen/Qwen2-72B-Instruct` model. It aims to improve the base model across all benchmarks.
55
+
56
+ # ⚡ Quantized GGUF
57
+
58
+ All GGUF models are available here: [MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF](https://huggingface.co/MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF)
59
+
60
+ # 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
61
+
62
+
63
+
64
+ | Tasks |Version|Filter|n-shot|Metric|Value | |Stderr|
65
+ |--------------|------:|------|-----:|------|-----:|---|-----:|
66
+ |truthfulqa_mc2| 2|none | 0|acc |0.6761|± |0.0148|
67
+
68
+ | Tasks |Version|Filter|n-shot|Metric|Value | |Stderr|
69
+ |----------|------:|------|-----:|------|-----:|---|-----:|
70
+ |winogrande| 1|none | 5|acc |0.8248|± |0.0107|
71
+
72
+ | Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
73
+ |-------------|------:|------|-----:|--------|-----:|---|-----:|
74
+ |arc_challenge| 1|none | 25|acc |0.6852|± |0.0136|
75
+ | | |none | 25|acc_norm|0.7184|± |0.0131|
76
+
77
+ |Tasks|Version| Filter |n-shot| Metric |Value | |Stderr|
78
+ |-----|------:|----------------|-----:|-----------|-----:|---|-----:|
79
+ |gsm8k| 3|strict-match | 5|exact_match|0.8582|± |0.0096|
80
+ | | |flexible-extract| 5|exact_match|0.8893|± |0.0086|
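
To read the `±` columns above, one option is to turn each standard error into an approximate 95% confidence interval. This is a rough sketch, assuming a normal approximation (`z = 1.96`). The helper name is illustrative, and the numbers are copied from the tables above.

```python
def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) for value ± z * stderr under a normal approximation."""
    margin = z * stderr
    return value - margin, value + margin


# (value, stderr) pairs copied from the tables above.
results = {
    "truthfulqa_mc2 acc": (0.6761, 0.0148),
    "winogrande acc": (0.8248, 0.0107),
    "arc_challenge acc_norm": (0.7184, 0.0131),
    "gsm8k strict-match": (0.8582, 0.0096),
}

for name, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{name}: {value:.4f} (95% CI {low:.4f} to {high:.4f})")
```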
81
+
82
+ # Prompt Template
83
+
84
+ This model uses the `ChatML` prompt template:
85
+
86
+ ```
87
+ <|im_start|>system
88
+ {System}
89
+ <|im_end|>
90
+ <|im_start|>user
91
+ {User}
92
+ <|im_end|>
93
+ <|im_start|>assistant
94
+ {Assistant}
95
+ ```
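
As an illustration of how the template is filled, here is a minimal sketch that renders a list of chat messages into the layout shown above. The function name `format_chatml` is hypothetical. When the tokenizer ships a chat template, it is probably the safer source of truth, so treat this as an aid for building raw prompts by hand.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render [{"role": ..., "content": ...}, ...] in the ChatML layout shown above."""
    parts = []
    for message in messages:
        # Each turn: start tag with role, content on its own line, then end tag.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}\n<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
print(format_chatml(messages))
```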
96
+
97
+ # How to use
98
+
99
+
100
+ ```python
101
+
102
+ # Use a pipeline as a high-level helper
103
+
104
+ from transformers import pipeline
105
+
106
+ messages = [
107
+ {"role": "user", "content": "Who are you?"},
108
+ ]
109
+ pipe = pipeline("text-generation", model="MaziyarPanahi/Qwen2-72B-Instruct-v0.1")
110
+ pipe(messages)
111
+
112
+
113
+ # Load model directly
114
+
115
+ from transformers import AutoTokenizer, AutoModelForCausalLM
116
+
117
+ tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Qwen2-72B-Instruct-v0.1")
118
+ model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/Qwen2-72B-Instruct-v0.1")
119
+ ```