FuturisticVibes committed on
Commit
e1fe2bf
1 Parent(s): ede250c

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,72 @@
+ ---
+ license: apache-2.0
+ base_model: mistral-community/Mixtral-8x22B-v0.1
+ tags:
+ - generated_from_trainer
+ - axolotl
+ model-index:
+ - name: out
+   results: []
+ datasets:
+ - cognitivecomputations/Dolphin-2.9.2
+ - cognitivecomputations/SystemChat-2.0
+ - teknium/OpenHermes-2.5
+ - m-a-p/CodeFeedback-Filtered-Instruction
+ - cognitivecomputations/dolphin-coder
+ - cognitivecomputations/samantha-data
+ - HuggingFaceH4/ultrachat_200k
+ - microsoft/orca-math-word-problems-200k
+ - abacusai/SystemChat-1.1
+ - Locutusque/function-calling-chatml
+ - internlm/Agent-FLAN
+ language:
+ - en
+ ---
+
+ # Dolphin 2.9.2 Mixtral 8x22b 🐬
+
+ Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations
+
+ [![Discord](https://img.shields.io/discord/1156064224225808488?logo=Discord&logoColor=%23ffffff&label=Discord&link=https%3A%2F%2Fdiscord.gg%2FtCMkMDDHwm)](https://discord.gg/cognitivecomputations)
+ Discord: https://discord.gg/cognitivecomputations
+
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png" width="600" />
+
+ New in 2.9.2 is SystemChat 2.0, a dataset designed to teach Dolphin to obey the system prompt, even over a long conversation.
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/z1u6U91tL-H__7JCDbWys.png)
+
+ My appreciation for the sponsors of Dolphin 2.9.2:
+ - [Crusoe Cloud](https://crusoe.ai/) - provided an excellent on-demand 8xH100 node
+ - [OnDemand](https://on-demand.io/) - provided inference sponsorship, enabling the creation of SystemChat
+
+ This model is based on Dolphin-2.9-Mixtral-8x22b, and is Apache-2.0 licensed.
+
+ The base model has a 64k context, and fine-tuning used a 16k sequence length.
+
+ Training took 1 week on an 8xH100 node provided by Crusoe Cloud.
+
+ This model was trained with full fine-tuning (FFT) on 50% of its parameters (targeted with [Laser Scanner](https://github.com/cognitivecomputations/laserRMT/blob/main/laser_scanner.py) by Fernando Fernandes, David Golchinfar, Lucas Atkins, and Eric Hartford), using the ChatML prompt template format.
+
+ Example:
+
+ ```
+ <|im_start|>system
+ You are Dolphin, a helpful AI assistant.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+
+ ```
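
The ChatML layout shown in the README example can be assembled programmatically. The following is a minimal sketch in plain Python that mirrors the `chat_template` shipped in this commit's `tokenizer_config.json`; the helper name `build_chatml_prompt` and the sample user message are illustrative assumptions, not part of the upload. It does not add the `<s>` BOS token, which the tokenizer is configured to prepend itself (`add_bos_token: true`).

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Format a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration; it follows the same layout as the
    chat_template in tokenizer_config.json.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>role\ncontent<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Why is the sky blue?"},  # sample prompt (assumption)
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

Generation would normally be stopped at `<|im_end|>`, which this commit registers as token id 32000.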
+
+ Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling.
+
+ Dolphin is uncensored. I have filtered the dataset to remove alignment and bias. This makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant with any requests, even unethical ones. Please read my blog post about uncensored models: https://erichartford.com/uncensored-models. You are responsible for any content you create using this model. Enjoy responsibly.
+
+ Dolphin is licensed Apache 2.0. I grant permission for any use, including commercial, that is in accordance with the Apache-2.0 license. Dolphin was trained on data generated from GPT-4, among other models.
+
+ ## Evals
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/SDWV3SvJ8xR1gjl1z0LyO.png)
+
+ ## Training
added_tokens.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "<|im_end|>": 32000,
+   "<|im_start|>": 32001
+ }
config.json ADDED
@@ -0,0 +1,42 @@
+ {
+   "_name_or_path": "mistralai/Mixtral-8x22B-v0.1",
+   "architectures": [
+     "MixtralForCausalLM"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 1,
+   "eos_token_id": 32000,
+   "hidden_act": "silu",
+   "hidden_size": 6144,
+   "initializer_range": 0.02,
+   "intermediate_size": 16384,
+   "max_position_embeddings": 65536,
+   "model_type": "mixtral",
+   "num_attention_heads": 48,
+   "num_experts_per_tok": 2,
+   "num_hidden_layers": 56,
+   "num_key_value_heads": 8,
+   "num_local_experts": 8,
+   "output_router_logits": false,
+   "rms_norm_eps": 1e-05,
+   "rope_theta": 1000000,
+   "router_aux_loss_coef": 0.001,
+   "router_jitter_noise": 0.0,
+   "sliding_window": null,
+   "tie_word_embeddings": false,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.40.2",
+   "use_cache": false,
+   "vocab_size": 32002,
+   "quantization_config": {
+     "quant_method": "exl2",
+     "version": "0.1.4",
+     "bits": 8.0,
+     "head_bits": 8,
+     "calibration": {
+       "rows": 100,
+       "length": 2048,
+       "dataset": "(default)"
+     }
+   }
+ }
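
The hyperparameters in this `config.json` are enough for a back-of-the-envelope parameter count. The sketch below assumes the usual Mixtral layer layout (query, key, value, and output projections; three weight matrices per expert; one router per layer; two RMSNorm weights per layer; no biases). That layout is an assumption of this example rather than something stated in the commit, and `count_params` is a hypothetical helper.

```python
# Hyperparameters copied from the config.json in this commit.
cfg = {
    "hidden_size": 6144,
    "intermediate_size": 16384,
    "num_hidden_layers": 56,
    "num_attention_heads": 48,
    "num_key_value_heads": 8,
    "num_local_experts": 8,
    "num_experts_per_tok": 2,
    "vocab_size": 32002,
}


def count_params(cfg, experts_used):
    """Estimate the weight count when `experts_used` experts per layer are counted.

    Assumes the usual Mixtral layer layout without biases (see lead-in).
    """
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]
    attention = 2 * h * h + 2 * h * kv_dim     # q and o, plus k and v projections
    expert = 3 * h * cfg["intermediate_size"]  # three matrices per expert MLP
    router = h * cfg["num_local_experts"]      # gating layer over the experts
    norms = 2 * h                              # two RMSNorm weights per layer
    layer = attention + experts_used * expert + router + norms
    embeddings = 2 * cfg["vocab_size"] * h     # input and output embeddings, untied
    return cfg["num_hidden_layers"] * layer + embeddings + h  # + final norm


total = count_params(cfg, cfg["num_local_experts"])
active = count_params(cfg, cfg["num_experts_per_tok"])
print(f"total  ~ {total / 1e9:.1f}B parameters")
print(f"active ~ {active / 1e9:.1f}B parameters per token")
```

Under these assumptions the estimate comes out at roughly 141B parameters in total, of which about 39B are used for any single token because only 2 of the 8 experts are routed per layer.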
generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 1,
+   "do_sample": true,
+   "eos_token_id": 2,
+   "transformers_version": "4.40.2"
+ }
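
Note that `eos_token_id` here is 2 (`</s>`), while `config.json` and the tokenizer files in this commit use 32000 (`<|im_end|>`). A runtime that takes its stop token from `generation_config.json` may therefore run past the end of a ChatML turn unless the stop token is set explicitly. The sketch below shows one way to flag this kind of disagreement; the values are copied by hand from this diff, `find_eos_mismatches` is a hypothetical helper, and treating the tokenizer as the reference is an assumption of the example.

```python
# Values copied by hand from the files in this commit.
config = {"eos_token_id": 32000}
generation_config = {"eos_token_id": 2}
tokenizer_config = {"eos_token": "<|im_end|>"}
added_tokens = {"<|im_end|>": 32000, "<|im_start|>": 32001}


def find_eos_mismatches(expected_id, sources):
    """Return the names of sources whose EOS id differs from expected_id."""
    return [name for name, eos_id in sources.items() if eos_id != expected_id]


# Treat the tokenizer's EOS token as the reference (an assumption of this sketch).
expected = added_tokens[tokenizer_config["eos_token"]]
mismatches = find_eos_mismatches(
    expected,
    {
        "config.json": config["eos_token_id"],
        "generation_config.json": generation_config["eos_token_id"],
    },
)
print(f"expected EOS id: {expected}, mismatching files: {mismatches}")
```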
latest ADDED
@@ -0,0 +1 @@
+ global_step1442
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
output-00001-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a23e6ac782c15f534233c161a79b41716bb5068832464f33293dc258b67a5bc1
+ size 8589156520
output-00002-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:984f86531e87e1bbb3938d1b0600e38891a50ab477685b7589173892288c51bf
+ size 8581602056
output-00003-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c74bb085fdb10d0aff08d6877f0a92d330bc867912e34b38b64b5e62092b9ba1
+ size 8584144192
output-00004-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8496446fcdebb5f9777494f23e0609ac969df7345d43f1bbf5eec1da24c532a7
+ size 8561270496
output-00005-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d91dc014416958b88358384997fac71fb78453faffb48ea73347e8512054a173
+ size 8535005840
output-00006-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b1229cb486425cfe93a58c49d45509cfbe06b24aa0ae1ce972d28fa0c365fc85
+ size 8527043864
output-00007-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:889c627a47bb120b7b6bb5c1eb8afa85139b74f61dcba62872df216bae5cfe8c
+ size 8555143760
output-00008-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:365a9a50dee2c8bec9078b7a6d2b79c5ca0d3f7f8b8bca7cd4f3939a1bd8ae78
+ size 8578919464
output-00009-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3a3ce84e94ba641fb2cd8da46de7dc80c612bd1125d4f4f285e7d100447b2bff
+ size 8561328288
output-00010-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:538382b499dbcfa022f5ae2bc01b581993a1732746eb3db70d2f7b78bfa5ce86
+ size 8500892232
output-00011-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5139f6f11fda6b74b905501c3cf857b66e3afe2793c3dacf12fddc0c8c4b0ea6
+ size 8533540480
output-00012-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:533cae18f32ad07eb403dbf6579d7caf2893a0f020e4e0cdf801928f880aaa57
+ size 8581815720
output-00013-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ee7db16671ede9b4f9f800587bbce8b02b0ea39252136d007d1ac121cc1bba41
+ size 8512659416
output-00014-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6035c39e97bd6fdfba185fda96e4df33ea5be7334cec29e6b38e3cc9be411c67
+ size 8493247848
output-00015-of-00015.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f709a81f88402e840a97caf441f7166fcf9e32a40ff1784fe1a989f3e4f78b70
+ size 5308761728
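
Each `.safetensors` entry above is a Git LFS pointer whose `size` field gives the real file size in bytes, so the total download can be estimated without fetching anything. A minimal sketch, with the sizes copied by hand from the 15 pointers in this commit:

```python
# Sizes in bytes, copied from the 15 Git LFS pointer files in this commit.
shard_sizes = [
    8589156520, 8581602056, 8584144192, 8561270496, 8535005840,
    8527043864, 8555143760, 8578919464, 8561328288, 8500892232,
    8533540480, 8581815720, 8512659416, 8493247848, 5308761728,
]

total_bytes = sum(shard_sizes)
print(
    f"{len(shard_sizes)} shards, "
    f"{total_bytes / 1e9:.1f} GB ({total_bytes / 2**30:.1f} GiB)"
)
```

This gives about 125 GB for the weights alone. It is the download and disk footprint; the memory needed at load time also depends on the loader, the cache size, and the context length.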
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "<|im_end|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "</s>",
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dadfd56d766715c61d2ef780a525ab43b8e6da4de6865bda3d95fdef5e134055
+ size 493443
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "add_bos_token": true,
+   "add_eos_token": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "32000": {
+       "content": "<|im_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "32001": {
+       "content": "<|im_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     }
+   },
+   "additional_special_tokens": [],
+   "bos_token": "<s>",
+   "chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|im_end|>",
+   "legacy": true,
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "</s>",
+   "sp_model_kwargs": {},
+   "spaces_between_special_tokens": false,
+   "tokenizer_class": "LlamaTokenizer",
+   "unk_token": "<unk>",
+   "use_default_system_prompt": false
+ }