Quantization made by Richard Erkhov.
[Github](https://github.com/RichardErkhov)
[Discord](https://discord.gg/pvy7H8DZMG)
[Request more models](https://github.com/RichardErkhov/quant_request)
# NeuralLLaMa-3-8b-DT-v0.1 - GGUF
- Model creator: https://huggingface.co/Kukedlc/
- Original model: https://huggingface.co/Kukedlc/NeuralLLaMa-3-8b-DT-v0.1/
| Name | Quant method | Size |
| ---- | ---- | ---- |
| [NeuralLLaMa-3-8b-DT-v0.1.Q2_K.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q2_K.gguf) | Q2_K | 2.96GB |
| [NeuralLLaMa-3-8b-DT-v0.1.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.IQ3_XS.gguf) | IQ3_XS | 3.28GB |
| [NeuralLLaMa-3-8b-DT-v0.1.IQ3_S.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.IQ3_S.gguf) | IQ3_S | 3.43GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q3_K_S.gguf) | Q3_K_S | 3.41GB |
| [NeuralLLaMa-3-8b-DT-v0.1.IQ3_M.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.IQ3_M.gguf) | IQ3_M | 3.52GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q3_K.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q3_K.gguf) | Q3_K | 3.74GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q3_K_M.gguf) | Q3_K_M | 3.74GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q3_K_L.gguf) | Q3_K_L | 4.03GB |
| [NeuralLLaMa-3-8b-DT-v0.1.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.IQ4_XS.gguf) | IQ4_XS | 4.18GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q4_0.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q4_0.gguf) | Q4_0 | 4.34GB |
| [NeuralLLaMa-3-8b-DT-v0.1.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.IQ4_NL.gguf) | IQ4_NL | 4.38GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q4_K_S.gguf) | Q4_K_S | 4.37GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q4_K.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q4_K.gguf) | Q4_K | 4.58GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q4_K_M.gguf) | Q4_K_M | 4.58GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q4_1.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q4_1.gguf) | Q4_1 | 4.78GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q5_0.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q5_0.gguf) | Q5_0 | 5.21GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q5_K_S.gguf) | Q5_K_S | 5.21GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q5_K.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q5_K.gguf) | Q5_K | 5.34GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q5_K_M.gguf) | Q5_K_M | 5.34GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q5_1.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q5_1.gguf) | Q5_1 | 5.65GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q6_K.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q6_K.gguf) | Q6_K | 6.14GB |
| [NeuralLLaMa-3-8b-DT-v0.1.Q8_0.gguf](https://huggingface.co/RichardErkhov/Kukedlc_-_NeuralLLaMa-3-8b-DT-v0.1-gguf/blob/main/NeuralLLaMa-3-8b-DT-v0.1.Q8_0.gguf) | Q8_0 | 7.95GB |
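A rough way to compare the table entries is to estimate each file's effective bits per weight: file size divided by parameter count. The sketch below assumes roughly 8.03 billion parameters for a Llama-3-8B model and decimal gigabytes; both are approximations, so treat the results as rough guides rather than exact figures.

```python
# Estimate effective bits per weight for a few of the quants listed above.
# Assumes ~8.03e9 parameters (approximate for Llama-3-8B) and decimal GB.
N_PARAMS = 8.03e9

def bits_per_weight(size_gb: float) -> float:
    """File size in GB -> approximate bits stored per model weight."""
    return size_gb * 1e9 * 8 / N_PARAMS

sizes = {"Q2_K": 2.96, "Q4_K_M": 4.58, "Q6_K": 6.14, "Q8_0": 7.95}
for name, gb in sizes.items():
    print(f"{name}: ~{bits_per_weight(gb):.2f} bits/weight")
```

Lower bits per weight means a smaller download and less RAM/VRAM, at the cost of more quantization error; Q4_K_M and Q5_K_M are common middle-ground choices.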
Original model description:
---
tags:
- merge
- mergekit
- lazymergekit
- mlabonne/ChimeraLlama-3-8B-v2
- nbeerbower/llama-3-stella-8B
- uygarkurt/llama-3-merged-linear
base_model:
- mlabonne/ChimeraLlama-3-8B-v2
- nbeerbower/llama-3-stella-8B
- uygarkurt/llama-3-merged-linear
license: other
---
# NeuralLLaMa-3-8b-DT-v0.1
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64d71ab4089bc502ceb44d29/tK72e9RGnYyBVRy0T_Kba.png)
NeuralLLaMa-3-8b-DT-v0.1 is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [mlabonne/ChimeraLlama-3-8B-v2](https://huggingface.co/mlabonne/ChimeraLlama-3-8B-v2)
* [nbeerbower/llama-3-stella-8B](https://huggingface.co/nbeerbower/llama-3-stella-8B)
* [uygarkurt/llama-3-merged-linear](https://huggingface.co/uygarkurt/llama-3-merged-linear)
## 🧩 Configuration
```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # No parameters necessary for base model
  - model: mlabonne/ChimeraLlama-3-8B-v2
    parameters:
      density: 0.33
      weight: 0.2
  - model: nbeerbower/llama-3-stella-8B
    parameters:
      density: 0.44
      weight: 0.4
  - model: uygarkurt/llama-3-merged-linear
    parameters:
      density: 0.55
      weight: 0.4
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: float16
```
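In a `dare_ties` merge, each fine-tune's delta from the base model is randomly sparsified: only about `density` of the delta values are kept, and the survivors are rescaled by `1/density` so the delta's expected value is preserved, before TIES-style sign resolution combines the models. A toy sketch of just that drop-and-rescale step (an illustration of the idea, not the actual mergekit implementation):

```python
import random

def dare_drop_and_rescale(delta, density, seed=0):
    """DARE's core trick: randomly keep about `density` of the delta
    values, zero the rest, and scale kept values by 1/density so the
    expected value of the delta is unchanged."""
    rng = random.Random(seed)
    return [d / density if rng.random() < density else 0.0 for d in delta]

delta = [0.1, -0.2, 0.05, 0.3, -0.1, 0.07, 0.2, -0.15]
sparse = dare_drop_and_rescale(delta, density=0.55)
# Surviving entries are scaled up by 1/0.55; the rest are zeroed.
```

This is why each model in the config above carries its own `density`: it controls how much of that model's delta survives before the weighted combination.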
## 🗨️ Chats
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64d71ab4089bc502ceb44d29/Uk89jeeRZ3Zh3wNBm6dXk.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64d71ab4089bc502ceb44d29/feYEkbM_TqeahAMOoiGoG.png)
## 💻 Usage
```python
# In a notebook first run: !pip install -qU transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization so the 8B model fits on a single consumer GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

MODEL_NAME = 'Kukedlc/NeuralLLaMa-3-8b-DT-v0.1'
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map='cuda:0', quantization_config=bnb_config)

prompt_system = ("You are an advanced language model that speaks Spanish fluently, clearly, and precisely. "
                 "You are called Roberto the Robot and you are an aspiring post-modern artist.")
prompt = "Create a piece of art that represents how you see yourself, Roberto, as an advanced LLM, with ASCII art, mixing diagrams and engineering, and let yourself go."

chat = [
    {"role": "system", "content": prompt_system},
    {"role": "user", "content": prompt},
]
chat_prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(chat_prompt, return_tensors="pt").to('cuda')

streamer = TextStreamer(tokenizer)
# Llama 3 ends each turn with <|eot_id|>; use its token id to stop generation.
# tokenizer.encode() would prepend <|begin_of_text|>, so look the token up directly.
stop = tokenizer.convert_tokens_to_ids("<|eot_id|>")
_ = model.generate(**inputs, streamer=streamer, max_new_tokens=1024, do_sample=True,
                   temperature=0.7, repetition_penalty=1.2, top_p=0.9, eos_token_id=stop)
```
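For reference, `apply_chat_template` renders the messages into Llama 3's header-delimited prompt format. The sketch below approximates that layout by hand, to show what string the model actually receives; it is an illustration of the expected format, not a replacement for the tokenizer's own template:

```python
def llama3_prompt(system: str, user: str) -> str:
    """Approximate the Llama 3 chat layout: each turn is wrapped in header
    tokens and terminated with <|eot_id|>, and the string ends with an open
    assistant header so generation continues from there."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(llama3_prompt("You are Roberto the Robot.", "Draw yourself in ASCII art."))
```

This is also why `<|eot_id|>` is the right stop token above: the model emits it to close its own turn.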