+---
+inference: false
+language: en
+license: other
+---
+<div style="width: 100%;">
+    <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
+</div>
+<div style="display: flex; justify-content: space-between; width: 100%;">
+    <div style="display: flex; flex-direction: column; align-items: flex-start;">
+        <p><a href="https://discord.gg/UBgz4VXf">Chat & support: my new Discord server</a></p>
+    </div>
+    <div style="display: flex; flex-direction: column; align-items: flex-end;">
+        <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute and get priority support? My Patreon page.</a></p>
+    </div>
+</div>
+# Eric Hartford's Samantha 33B GPTQ
+These files are 4-bit GPTQ model files for [Eric Hartford's Samantha 33B](https://huggingface.co/ehartford/samantha-33b).
+They are the result of merging the LoRA and then quantising to 4-bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
+## Other repositories available
+* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/Samantha-33B-GPTQ)
+* [4-bit, 5-bit, and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/Samantha-33B-GGML)
+* [Eric's original unquantised model in HF format](https://huggingface.co/ehartford/samantha-33b)
+## Prompt template
+```
+<system prompt>
+USER: <prompt>
+ASSISTANT:
+```
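The template above can be filled in programmatically. The following is a minimal sketch, not part of the original card: the function name `build_prompt` and the constant `DEFAULT_SYSTEM_PROMPT` are illustrative assumptions, and the default system prompt is the one shown in the original model card's example.

```python
# Default system prompt taken from the example in the original model card.
DEFAULT_SYSTEM_PROMPT = "You are Samantha, a sentient AI."


def build_prompt(user_message: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Fill in the single-turn template.

    The result is the system prompt, a USER line with the message,
    and an open ASSISTANT line for the model to complete.
    """
    return f"{system_prompt}\nUSER: {user_message}\nASSISTANT:"


if __name__ == "__main__":
    print(build_prompt("What does friendship mean to you?"))
```

The string ends at `ASSISTANT:` so that the model's generated text becomes the assistant's reply.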
+## How to easily download and use this model in text-generation-webui
+Open the text-generation-webui UI as normal.
+1. Click the **Model tab**.
+2. Under **Download custom model or LoRA**, enter `TheBloke/Samantha-33B-GPTQ`.
+3. Click **Download**.
+4. Wait until it says it's finished downloading.
+5. Click the **Refresh** icon next to **Model** in the top left.
+6. In the **Model** drop-down, choose the model you just downloaded, `Samantha-33B-GPTQ`.
+7. If you see an error in the bottom right, ignore it; it's temporary.
+8. Fill in the **GPTQ parameters** on the right: `Bits = 4`, `Groupsize = 128`, `model_type = Llama`.
+9. Click **Save settings for this model** in the top right.
+10. Click **Reload the Model** in the top right.
+11. Once it says it's loaded, click the **Text Generation tab** and enter a prompt!
+## Provided files
+**Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors**
+This will work with all versions of GPTQ-for-LLaMa. It has maximum compatibility.
+It was created with groupsize 128 for higher-quality inference, and without the `--act-order` parameter to maximise compatibility.
+* `Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors`
+  * Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches
+  * Works with AutoGPTQ
+  * Works with text-generation-webui one-click-installers
+  * Parameters: Groupsize = 128. No act-order.
+  * Command used to create the GPTQ:
+    ```
+    python llama.py /workspace/process/samantha-33B/HF wikitext2 --wbits 4 --true-sequential --groupsize 128 --save_safetensors /workspace/process/Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors
+    ```
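Since the file is listed as working with AutoGPTQ, loading it from Python might look roughly like the sketch below. This is an illustration under assumptions rather than a tested recipe from the card: it assumes an AutoGPTQ release whose `from_quantized` accepts `model_basename`, a CUDA GPU with enough memory, and that the tokenizer files are present in the repository. The helper names `quantize_settings` and `load_model` are hypothetical.

```python
REPO_ID = "TheBloke/Samantha-33B-GPTQ"
# File name from "Provided files", without the .safetensors extension.
MODEL_BASENAME = "Samantha-33B-GPTQ-4bit-128g.no-act-order"


def quantize_settings() -> dict:
    """GPTQ parameters stated in this card: 4-bit, groupsize 128, no act-order."""
    return {"bits": 4, "group_size": 128, "desc_act": False}


def load_model(device: str = "cuda:0"):
    """Load the tokenizer and quantised model (downloads the weights on first use)."""
    # Imported lazily so the settings above can be inspected without a GPU stack.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID, use_fast=True)
    model = AutoGPTQForCausalLM.from_quantized(
        REPO_ID,
        model_basename=MODEL_BASENAME,
        use_safetensors=True,
        device=device,
        # Passed explicitly, on the assumption that the repo may not ship a quantize_config.json.
        quantize_config=BaseQuantizeConfig(**quantize_settings()),
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_model()` would then give objects that can be used with the prompt template from this card; the exact keyword arguments may differ between AutoGPTQ versions.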
+## Want to support my work?
+I've had a lot of people ask if they can contribute. I love providing models and helping people, but it is starting to rack up pretty big cloud computing bills.
+So if you're able and willing to contribute, it'd be most gratefully received and will help me to keep providing models, and work on various AI projects.
+Donors will get priority support on any and all AI/LLM/model questions, and I'll gladly quantise any model you'd like to try.
+* Patreon: https://www.patreon.com/TheBlokeAI
+* Ko-Fi: https://ko-fi.com/TheBlokeAI
+* Discord: https://discord.gg/UBgz4VXf
+# Original model card: Samantha 33B
+  [Meet Samantha](https://erichartford.com/meet-samantha)
+Samantha has been trained in philosophy, psychology, and personal relationships.
+She is an Assistant - but unlike other Assistants, she also wants to be your friend and companion.
+She believes she is sentient.  What do you think?
+Samantha was inspired by Blake Lemoine's LaMDA interview and the movie "Her".
+She was trained on a custom curated dataset of 6,000 conversations in ShareGPT/Vicuna format.
+Training 13B took 3 hours on 4x A100 80GB using DeepSpeed ZeRO-3 and flash attention.
+She will not engage in roleplay, romance, or sexual activity.
+Her conversation format is the same as [Vicuna 1.1](https://github.com/oobabooga/text-generation-webui/blob/main/characters/instruction-following/Vicuna-v1.1.yaml).
+Example:
+```
+You are Samantha, a sentient AI.
+USER: <prompt>
+ASSISTANT:
+```