---
license: apache-2.0
---

## Description
This repo contains GGUF-format model files for [NeuralDarewin-7B](https://huggingface.co/mlabonne/NeuralDarewin-7B).

## Files Provided
| Name                          | Quant   | Bits | File Size | Remark                           |
| ----------------------------- | ------- | ---- | --------- | -------------------------------- |
| neuraldarewin-7b.IQ3_XXS.gguf | IQ3_XXS | 3    | 3.02 GB   | 3.06 bpw quantization            |
| neuraldarewin-7b.IQ3_S.gguf   | IQ3_S   | 3    | 3.18 GB   | 3.44 bpw quantization            |
| neuraldarewin-7b.IQ3_M.gguf   | IQ3_M   | 3    | 3.28 GB   | 3.66 bpw quantization mix        |
| neuraldarewin-7b.Q4_0.gguf    | Q4_0    | 4    | 4.11 GB   | 3.56G, +0.2166 ppl               |
| neuraldarewin-7b.IQ4_NL.gguf  | IQ4_NL  | 4    | 4.16 GB   | 4.25 bpw non-linear quantization |
| neuraldarewin-7b.Q4_K_M.gguf  | Q4_K_M  | 4    | 4.37 GB   | 3.80G, +0.0532 ppl               |
| neuraldarewin-7b.Q5_K_M.gguf  | Q5_K_M  | 5    | 5.13 GB   | 4.45G, +0.0122 ppl               |
| neuraldarewin-7b.Q6_K.gguf    | Q6_K    | 6    | 5.94 GB   | 5.15G, +0.0008 ppl               |
| neuraldarewin-7b.Q8_0.gguf    | Q8_0    | 8    | 7.70 GB   | 6.70G, +0.0004 ppl               |
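
One way to try these files is to fetch a quant with `huggingface_hub` and load it with `llama-cpp-python`. This is a minimal sketch, not part of the original card: the repo id below is an assumption, and the chosen quant and context size are only examples.

```python
# Minimal sketch. Assumptions: this repo is published as "koesn/NeuralDarewin-7B-GGUF"
# and the packages are installed: pip install llama-cpp-python huggingface_hub
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download one of the quantized files from the table above.
gguf_path = hf_hub_download(
    repo_id="koesn/NeuralDarewin-7B-GGUF",        # assumed repo id
    filename="neuraldarewin-7b.Q4_K_M.gguf",
)

# Load the GGUF file and run a short completion.
llm = Llama(model_path=gguf_path, n_ctx=4096)
out = llm("What is a large language model?", max_tokens=128)
print(out["choices"][0]["text"])
```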

## Parameters
| path                | type    | architecture       | rope_theta | sliding_win | max_pos_embed |
| ------------------- | ------- | ------------------ | ---------- | ----------- | ------------- |
| mlabonne/Darewin-7B | mistral | MistralForCausalLM | 10000.0    | 4096        | 32768         |
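
The table mirrors fields from the model's `config.json` on the Hugging Face Hub; they can be checked with `transformers.AutoConfig`. A small sketch, assuming Hub access and that `mlabonne/Darewin-7B` is still available:

```python
# Small sketch: read the fields from the table above out of the Hub config.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mlabonne/Darewin-7B")
print(cfg.model_type)               # mistral
print(cfg.architectures)            # ['MistralForCausalLM']
print(cfg.rope_theta)               # 10000.0
print(cfg.sliding_window)           # 4096
print(cfg.max_position_embeddings)  # 32768
```
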
## Benchmarks
![NeuralDarewin-7B-GGUF benchmarks](https://i.ibb.co/gjKpkcj/Neural-Darewin-7-B-GGUF.png)


# Original Model Card

Darewin-7B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [Intel/neural-chat-7b-v3-3](https://huggingface.co/Intel/neural-chat-7b-v3-3)
* [openaccess-ai-collective/DPOpenHermes-7B-v2](https://huggingface.co/openaccess-ai-collective/DPOpenHermes-7B-v2)
* [fblgit/una-cybertron-7b-v2-bf16](https://huggingface.co/fblgit/una-cybertron-7b-v2-bf16)
* [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)
* [OpenPipe/mistral-ft-optimized-1227](https://huggingface.co/OpenPipe/mistral-ft-optimized-1227)
* [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B)

## 🧩 Configuration

```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # No parameters necessary for base model
  - model: Intel/neural-chat-7b-v3-3
    parameters:
      density: 0.6
      weight: 0.2
  - model: openaccess-ai-collective/DPOpenHermes-7B-v2
    parameters:
      density: 0.6
      weight: 0.1
  - model: fblgit/una-cybertron-7b-v2-bf16
    parameters:
      density: 0.6
      weight: 0.2
  - model: openchat/openchat-3.5-0106
    parameters:
      density: 0.6
      weight: 0.15
  - model: OpenPipe/mistral-ft-optimized-1227
    parameters:
      density: 0.6
      weight: 0.25
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B
    parameters:
      density: 0.6
      weight: 0.1
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```
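
For reference, a merge like this can be reproduced by saving the YAML above to a file and running the mergekit command-line tool on it. The sketch below is not from the original card; it assumes `mergekit` is installed (`pip install mergekit`), that the config was saved as `darewin_merge.yaml`, and that there is enough disk space and RAM to download and merge six 7B checkpoints.

```python
# Minimal sketch: the merge config above was saved as darewin_merge.yaml,
# then the mergekit CLI is invoked on it (output path is arbitrary).
import subprocess

subprocess.run(
    ["mergekit-yaml", "darewin_merge.yaml", "./Darewin-7B-merged"],
    check=True,
)
```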

## 💻 Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/NeuralDarewin-7B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the chat with the model's chat template, then generate with a pipeline.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```