+---
+pipeline_tag: text-generation
+tags:
+- qwen
+- qwen-2
+- quantized
+- 2-bit
+- 3-bit
+- 4-bit
+- 5-bit
+- 6-bit
+- 8-bit
+- 16-bit
+- GGUF
+inference: false
+model_creator: MaziyarPanahi
+model_name: Qwen2-72B-Instruct-v0.1-GGUF
+quantized_by: MaziyarPanahi
+license: other
+license_name: tongyi-qianwen
+license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
+---
+# MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF
+The GGUF and quantized models here are based on the [MaziyarPanahi/Qwen2-72B-Instruct-v0.1](https://huggingface.co/MaziyarPanahi/Qwen2-72B-Instruct-v0.1) model.
+## How to download
+Instead of cloning the entire repository, you can download only the quants you need:
+```
+huggingface-cli download MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF --local-dir . --include '*Q2_K*gguf'
+```
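The same selective download can be scripted from Python. The sketch below assumes the `huggingface_hub` package is installed and relies on its `snapshot_download` function with the `allow_patterns` argument. The helper names `quant_pattern`, `select_files`, and `download_quant` are illustrative and not part of this repository.

```python
from fnmatch import fnmatchcase
from typing import List


def quant_pattern(quant: str) -> str:
    """Build the same glob that the CLI example passes to --include."""
    return f"*{quant}*gguf"


def select_files(filenames: List[str], quant: str) -> List[str]:
    """Preview locally which file names the glob would match."""
    pattern = quant_pattern(quant)
    return [name for name in filenames if fnmatchcase(name, pattern)]


def download_quant(repo_id: str, quant: str, local_dir: str = ".") -> str:
    """Download only the files of one quantization level (hypothetical helper)."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=[quant_pattern(quant)],
        local_dir=local_dir,
    )
```

Calling `download_quant("MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF", "Q2_K")` would fetch the matching files into the current directory. These files are large, so it is worth checking the pattern with `select_files` first.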
+## Load GGUF models
+You `MUST` follow the `ChatML` prompt template used by this model:
+```sh
+./llama.cpp/main -m Qwen2-72B-Instruct-v0.1.Q2_K.gguf -p "<|im_start|>user\nJust say 1, 2, 3 hi and NOTHING else\n<|im_end|>\n<|im_start|>assistant\n" -n 1024
+```
+## Original README
+---
+# MaziyarPanahi/Qwen2-72B-Instruct-v0.1
+This is a fine-tuned version of the `Qwen/Qwen2-72B-Instruct` model. It aims to improve the base model across all benchmarks.
+# ⚡ Quantized GGUF
+All GGUF models are available here: [MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF](https://huggingface.co/MaziyarPanahi/Qwen2-72B-Instruct-v0.1-GGUF)
+# 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+|Tasks         |Version|Filter          |n-shot|Metric     |Value |   |Stderr|
+|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
+|truthfulqa_mc2|      2|none            |     0|acc        |0.6761|±  |0.0148|
+|winogrande    |      1|none            |     5|acc        |0.8248|±  |0.0107|
+|arc_challenge |      1|none            |    25|acc        |0.6852|±  |0.0136|
+|              |       |none            |    25|acc_norm   |0.7184|±  |0.0131|
+|gsm8k         |      3|strict-match    |     5|exact_match|0.8582|±  |0.0096|
+|              |       |flexible-extract|     5|exact_match|0.8893|±  |0.0086|
+# Prompt Template
+This model uses the `ChatML` prompt template:
+```
+<|im_start|>system
+{System}
+<|im_end|>
+<|im_start|>user
+{User}
+<|im_end|>
+<|im_start|>assistant
+{Assistant}
+```
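As a minimal sketch, the template can be assembled from a list of role/content messages in plain Python. The function name `build_chatml_prompt` and its `add_generation_prompt` parameter are illustrative assumptions. The line layout follows the template as written in this README, and the chat template bundled with the tokenizer remains the authoritative format.

```python
from typing import Dict, List


def build_chatml_prompt(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Render messages in the ChatML layout shown in this README."""
    parts = []
    for message in messages:
        # Each turn: start marker plus role, the content, then the end marker.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}\n<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
    ]
)
print(prompt)
```

When working with `transformers`, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` renders the prompt from the tokenizer's own template, which is usually the safer choice.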
+# How to use
+```python
+# Use a pipeline as a high-level helper
+from transformers import pipeline
+messages = [
+    {"role": "user", "content": "Who are you?"},
+]
+pipe = pipeline("text-generation", model="MaziyarPanahi/Qwen2-72B-Instruct-v0.1")
+pipe(messages)
+# Load model directly
+from transformers import AutoTokenizer, AutoModelForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Qwen2-72B-Instruct-v0.1")
+model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/Qwen2-72B-Instruct-v0.1")
+```