Text Generation
Transformers
English
llama-2
code
Eval Results
Inference Endpoints
bartowski commited on
Commit
cd61e95
1 Parent(s): 45a7a37

Quant for 5.0

Browse files
.gitattributes CHANGED
@@ -1,35 +1,4 @@
1
- *.7z filter=lfs diff=lfs merge=lfs -text
2
- *.arrow filter=lfs diff=lfs merge=lfs -text
3
- *.bin filter=lfs diff=lfs merge=lfs -text
4
- *.bz2 filter=lfs diff=lfs merge=lfs -text
5
- *.ckpt filter=lfs diff=lfs merge=lfs -text
6
- *.ftz filter=lfs diff=lfs merge=lfs -text
7
- *.gz filter=lfs diff=lfs merge=lfs -text
8
- *.h5 filter=lfs diff=lfs merge=lfs -text
9
- *.joblib filter=lfs diff=lfs merge=lfs -text
10
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
- *.model filter=lfs diff=lfs merge=lfs -text
13
- *.msgpack filter=lfs diff=lfs merge=lfs -text
14
- *.npy filter=lfs diff=lfs merge=lfs -text
15
- *.npz filter=lfs diff=lfs merge=lfs -text
16
- *.onnx filter=lfs diff=lfs merge=lfs -text
17
- *.ot filter=lfs diff=lfs merge=lfs -text
18
- *.parquet filter=lfs diff=lfs merge=lfs -text
19
- *.pb filter=lfs diff=lfs merge=lfs -text
20
- *.pickle filter=lfs diff=lfs merge=lfs -text
21
- *.pkl filter=lfs diff=lfs merge=lfs -text
22
- *.pt filter=lfs diff=lfs merge=lfs -text
23
- *.pth filter=lfs diff=lfs merge=lfs -text
24
- *.rar filter=lfs diff=lfs merge=lfs -text
25
- *.safetensors filter=lfs diff=lfs merge=lfs -text
26
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
- *.tar.* filter=lfs diff=lfs merge=lfs -text
28
- *.tar filter=lfs diff=lfs merge=lfs -text
29
- *.tflite filter=lfs diff=lfs merge=lfs -text
30
- *.tgz filter=lfs diff=lfs merge=lfs -text
31
- *.wasm filter=lfs diff=lfs merge=lfs -text
32
- *.xz filter=lfs diff=lfs merge=lfs -text
33
- *.zip filter=lfs diff=lfs merge=lfs -text
34
- *.zst filter=lfs diff=lfs merge=lfs -text
35
- *tfevents* filter=lfs diff=lfs merge=lfs -text
 
1
+ model-00002-of-00002.safetensors filter=lfs diff=lfs merge=lfs -text
2
+ model-00001-of-00002.safetensors filter=lfs diff=lfs merge=lfs -text
3
+ output.safetensors filter=lfs diff=lfs merge=lfs -text
4
+ tokenizer.model filter=lfs diff=lfs merge=lfs -text
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
README.md CHANGED
@@ -25,59 +25,102 @@ model-index:
25
  type: pass@1
26
  value: 34.146
27
  verified: false
28
- quantized_by: bartowski
29
  ---
30
 
31
- ## Exllama v2 Quantizations of speechless-mistral-dolphin-orca-platypus-samantha-7b
32
 
33
- Using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.0.11">turboderp's ExLlamaV2 v0.0.11</a> for quantization.
 
 
34
 
35
- # The "main" branch only contains the measurement.json, download one of the other branches for the model (see below)
36
 
37
- Each branch contains an individual bits per weight, with the main one containing only the meaurement.json for further conversions.
38
 
39
- Conversion was done using the default calibration dataset.
40
 
41
- Default arguments used except when the bits per weight is above 6.0, at that point the lm_head layer is quantized at 8 bits per weight instead of the default 6.
42
 
43
- Original model: https://huggingface.co/uukuguy/speechless-mistral-dolphin-orca-platypus-samantha-7b
 
 
44
 
 
45
 
 
46
 
47
- <a href="https://huggingface.co/bartowski/speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2/tree/8_0">8.0 bits per weight</a>
48
 
49
- <a href="https://huggingface.co/bartowski/speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2/tree/6_5">6.5 bits per weight</a>
50
 
51
- <a href="https://huggingface.co/bartowski/speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2/tree/5_0">5.0 bits per weight</a>
52
 
53
- <a href="https://huggingface.co/bartowski/speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2/tree/4_0">4.0 bits per weight</a>
54
 
55
- <a href="https://huggingface.co/bartowski/speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2/tree/3_5">3.5 bits per weight</a>
56
 
57
- ## Download instructions
58
 
59
- With git:
60
 
61
- ```shell
62
- git clone --single-branch --branch 4_0 https://huggingface.co/bartowski/speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2
63
- ```
64
 
65
- With huggingface hub (credit to TheBloke for instructions):
 
 
 
 
 
 
 
 
 
66
 
67
- ```shell
68
- pip3 install huggingface-hub
69
- ```
70
 
71
- To download the `main` (only useful if you only care about measurement.json) branch to a folder called `speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2`:
 
72
 
73
- ```shell
74
- mkdir speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2
75
- huggingface-cli download bartowski/speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2 --local-dir speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2 --local-dir-use-symlinks False
76
- ```
77
 
78
- To download from a different branch, add the `--revision` parameter:
79
 
80
- ```shell
81
- mkdir speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2
82
- huggingface-cli download bartowski/speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2 --revision 4_0 --local-dir speechless-mistral-dolphin-orca-platypus-samantha-7b-exl2 --local-dir-use-symlinks False
83
- ```
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
25
  type: pass@1
26
  value: 34.146
27
  verified: false
 
28
  ---
29
 
30
+ <p><h1> speechless-mistral-dolphin-orca-platypus-samantha-7b </h1></p>
31
 
32
+ * [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/speechless-mistral-dolphin-orca-platypus-samantha-7B-AWQ)
33
+ * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/speechless-mistral-dolphin-orca-platypus-samantha-7B-GPTQ)
34
+ * [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/speechless-mistral-dolphin-orca-platypus-samantha-7B-GGUF)
35
 
36
+ This model is a merge of ehartford/dolphin-2.1-mistral-7b, Open-Orca/Mistral-7B-OpenOrca, bhenrym14/mistral-7b-platypus-fp16 and ehartford/samantha-1.2-mistral-7b.
37
 
38
+ I'm very sorry for giving such a long and peculiar name. Originally, it was just my lazy behavior during the process of making models to easily distinguish various model and dataset combinations. I didn't expect the [previous model](https://huggingface.co/uukuguy/speechless-llama2-hermes-orca-platypus-wizardlm-13b) ([Thebloke GPTQ Version](https://huggingface.co/TheBloke/Speechless-Llama2-Hermes-Orca-Platypus-WizardLM-13B-GPTQ)) to be so popular. This time, based on some guys's request, I am releasing a model based on Mistral, and I have also inherited the style of the super long name along with it. Welcome to try the model, please refrain from harsh criticism if you don't like it.
39
 
40
+ Code: https://github.com/uukuguy/speechless
41
 
42
+ ## HumanEval
43
 
44
+ | Metric | Value |
45
+ | --- | --- |
46
+ | humaneval-python | 34.146|
47
 
48
+ [Big Code Models Leaderboard](https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard)
49
 
50
+ CodeLlama-34B-Python: 53.29
51
 
52
+ CodeLlama-34B-Instruct: 50.79
53
 
54
+ CodeLlama-13B-Instruct: 50.6
55
 
56
+ CodeLlama-34B: 45.11
57
 
58
+ CodeLlama-13B-Python: 42.89
59
 
60
+ CodeLlama-13B: 35.07
61
 
62
+ Mistral-7B-v0.1: 30.488
63
 
64
+ ## LM-Evaluation-Harness
65
 
66
+ [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 
 
67
 
68
+ | Metric | Value |
69
+ | --- | --- |
70
+ | ARC | 64.33 |
71
+ | HellaSwag | 84.4|
72
+ | MMLU | 63.72 |
73
+ | TruthfulQA | 52.52|
74
+ | Winogrande | 78.37 |
75
+ | GSM8K | 21.38 |
76
+ | DROP | 8.66 |
77
+ | Average | 53.34 |
78
 
79
+ # Model Card for Mistral-7B-v0.1
 
 
80
 
81
+ The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.
82
+ Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.
83
 
84
+ For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).
 
 
 
85
 
86
+ ## Model Architecture
87
 
88
+ Mistral-7B-v0.1 is a transformer model, with the following architecture choices:
89
+ - Grouped-Query Attention
90
+ - Sliding-Window Attention
91
+ - Byte-fallback BPE tokenizer
92
+
93
+ ## Troubleshooting
94
+
95
+ - If you see the following error:
96
+ ``
97
+ KeyError: 'mistral'
98
+ ``
99
+ - Or:
100
+ ``
101
+ NotImplementedError: Cannot copy out of meta tensor; no data!
102
+ ``
103
+
104
+ Ensure you are utilizing a stable version of Transformers, 4.34.0 or newer.
105
+
106
+ ## Notice
107
+
108
+ Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.
109
+
110
+ ## The Mistral AI Team
111
+
112
+ Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.`
113
+
114
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
115
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_uukuguy__speechless-mistral-dolphin-orca-platypus-samantha-7b)
116
+
117
+ | Metric | Value |
118
+ |-----------------------|---------------------------|
119
+ | Avg. | 53.34 |
120
+ | ARC (25-shot) | 64.33 |
121
+ | HellaSwag (10-shot) | 84.4 |
122
+ | MMLU (5-shot) | 63.72 |
123
+ | TruthfulQA (0-shot) | 52.52 |
124
+ | Winogrande (5-shot) | 78.37 |
125
+ | GSM8K (5-shot) | 21.38 |
126
+ | DROP (3-shot) | 8.66 |
added_tokens.json ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ {
2
+ "</s>": 2,
3
+ "<s>": 1,
4
+ "<unk>": 0
5
+ }
config.json ADDED
@@ -0,0 +1,25 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "_name_or_path": "/opt/local/llm_models/huggingface.co/mistralai/Mistral-7B-v0.1",
3
+ "architectures": [
4
+ "MistralForCausalLM"
5
+ ],
6
+ "bos_token_id": 1,
7
+ "eos_token_id": 2,
8
+ "hidden_act": "silu",
9
+ "hidden_size": 4096,
10
+ "initializer_range": 0.02,
11
+ "intermediate_size": 14336,
12
+ "max_position_embeddings": 32768,
13
+ "model_type": "mistral",
14
+ "num_attention_heads": 32,
15
+ "num_hidden_layers": 32,
16
+ "num_key_value_heads": 8,
17
+ "rms_norm_eps": 1e-05,
18
+ "rope_theta": 10000.0,
19
+ "sliding_window": 4096,
20
+ "tie_word_embeddings": false,
21
+ "torch_dtype": "bfloat16",
22
+ "transformers_version": "4.34.0",
23
+ "use_cache": true,
24
+ "vocab_size": 32000
25
+ }
model.safetensors.index.json ADDED
@@ -0,0 +1 @@
 
 
1
+ {"metadata": {}, "weight_map": {"model.embed_tokens.weight": "model-00001-of-00002.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.20.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.20.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.21.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.21.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.22.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.22.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.22.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.22.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.28.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.28.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.28.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.28.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.28.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.28.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.28.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.28.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.28.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.29.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.29.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.29.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.29.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.29.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.29.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.29.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.29.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.29.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.30.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.30.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.30.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.30.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.30.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.30.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.30.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.30.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.30.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.31.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.31.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.31.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.31.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.31.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.31.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.31.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.31.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.31.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.norm.weight": "model-00002-of-00002.safetensors", "lm_head.weight": "model-00002-of-00002.safetensors"}}
original_repo_url.txt ADDED
@@ -0,0 +1 @@
 
 
1
+ https://huggingface.co/uukuguy/speechless-mistral-dolphin-orca-platypus-samantha-7b
output.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f58d7ff9c0712398d255f52a8bea68fe2f801f6a482a200a6b08f5c38ef04a01
3
+ size 4733427164
special_tokens_map.json ADDED
@@ -0,0 +1,10 @@
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "additional_special_tokens": [
3
+ "<unk>",
4
+ "<s>",
5
+ "</s>"
6
+ ],
7
+ "bos_token": "<s>",
8
+ "eos_token": "</s>",
9
+ "unk_token": "<unk>"
10
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dadfd56d766715c61d2ef780a525ab43b8e6da4de6865bda3d95fdef5e134055
3
+ size 493443
tokenizer_config.json ADDED
@@ -0,0 +1,44 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "<unk>",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "1": {
12
+ "content": "<s>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "2": {
20
+ "content": "</s>",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ }
27
+ },
28
+ "additional_special_tokens": [
29
+ "<unk>",
30
+ "<s>",
31
+ "</s>"
32
+ ],
33
+ "bos_token": "<s>",
34
+ "clean_up_tokenization_spaces": false,
35
+ "eos_token": "</s>",
36
+ "legacy": true,
37
+ "model_max_length": 1000000000000000019884624838656,
38
+ "pad_token": null,
39
+ "sp_model_kwargs": {},
40
+ "spaces_between_special_tokens": false,
41
+ "tokenizer_class": "LlamaTokenizer",
42
+ "unk_token": "<unk>",
43
+ "use_default_system_prompt": true
44
+ }