cgus committed
Commit 5f4ab0c
1 Parent(s): 214629e

Create README.md

Files changed (1):
  1. README.md +79 -0
README.md ADDED
---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
base_model: natong19/Qwen2-7B-Instruct-abliterated
inference: false
tags:
- chat
---
# Qwen2-7B-Instruct-abliterated-GGUF
Model: [Qwen2-7B-Instruct-abliterated](https://huggingface.co/natong19/Qwen2-7B-Instruct-abliterated)
Made by: [natong19](https://huggingface.co/natong19)

Based on original model: [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct)
Created by: [Qwen](https://huggingface.co/Qwen)

## Quantization notes
Quantized with llama.cpp build b3154, using an importance matrix (imatrix) generated from the Exllamav2 calibration dataset.
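
The GGUF files in this repo can be run with any llama.cpp-based tool. As a quick test, here is a minimal sketch using the llama-cpp-python bindings; the `.gguf` filename and generation settings below are assumptions, so substitute whichever quant file you actually downloaded.

```python
# Minimal llama-cpp-python sketch. The model_path filename is an
# assumption -- use the actual .gguf file downloaded from this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen2-7B-Instruct-abliterated-Q4_K_M.gguf",
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload all layers to the GPU if one is available
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Give me a short introduction to large language models."},
    ],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```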

# Original model card
# Qwen2-7B-Instruct-abliterated

## Introduction

Abliterated version of [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) using [failspy](https://huggingface.co/failspy)'s notebook.
The model's strongest refusal directions have been ablated via weight orthogonalization, but the model may still refuse your request, misunderstand your intent, or provide unsolicited advice regarding ethics or safety.
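
For intuition, weight orthogonalization projects a learned "refusal direction" out of the weight matrices that write into the residual stream, so those layers can no longer express that direction. A minimal sketch of the operation (not the notebook's actual code; the unit vector `r` is assumed to have been estimated beforehand from contrastive harmful/harmless activations):

```python
import torch

def orthogonalize(W: torch.Tensor, r: torch.Tensor) -> torch.Tensor:
    """Remove the component along direction r from a weight matrix W.

    W: (d_model, d_in) matrix whose outputs live in the residual stream.
    r: (d_model,) refusal direction, assumed precomputed.
    """
    r = r / r.norm()
    # W' = W - r (r^T W): the layer's output can no longer write along r.
    return W - torch.outer(r, r @ W)
```

Applied to each matrix that writes into the residual stream (for example the attention and MLP output projections), this suppresses the refusal behavior while leaving the rest of the weights untouched.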

## Quickstart

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "natong19/Qwen2-7B-Instruct-abliterated"
device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=256
)
# Strip the prompt tokens so only the new completion remains.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

## Evaluation

Evaluation framework: lm-evaluation-harness 0.4.2

| Dataset | Qwen2-7B-Instruct | Qwen2-7B-Instruct-abliterated |
| :--- | :---: | :---: |
| ARC (25-shot) | 62.5 | 62.5 |
| GSM8K (5-shot) | 73.0 | 72.2 |
| HellaSwag (10-shot) | 81.8 | 81.7 |
| MMLU (5-shot) | 70.7 | 70.5 |
| TruthfulQA (0-shot) | 57.3 | 55.0 |
| Winogrande (5-shot) | 76.2 | 77.4 |
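
As a rough guide to reproducing a row of this table, here is a sketch using lm-evaluation-harness 0.4.2's Python API; the task name `arc_challenge` and the settings below are assumptions, not the evaluator's exact configuration.

```python
# Hypothetical reproduction of the "ARC (25-shot)" row; the task name
# and settings are assumptions, not the original evaluation config.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=natong19/Qwen2-7B-Instruct-abliterated,dtype=auto",
    tasks=["arc_challenge"],
    num_fewshot=25,
    batch_size="auto",
)
print(results["results"]["arc_challenge"])
```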