Crystalcareai committed
Commit 97af64b (1 parent: cb7b401)

Update README.md

Files changed (1)
  1. README.md +45 -8
README.md CHANGED
@@ -1,14 +1,55 @@
  ---
- base_model: cognitivecomputations/mixtral-1x22b-base
+ license: apache-2.0
+ base_model: mistral-community/Mixtral-8x22B-v0.1
  tags:
  - generated_from_trainer
+ - axolotl
  model-index:
- - name: 1x22b-out
+ - name: out
  results: []
+ datasets:
+ - cognitivecomputations/Dolphin-2.9
+ - teknium/OpenHermes-2.5
+ - m-a-p/CodeFeedback-Filtered-Instruction
+ - cognitivecomputations/dolphin-coder
+ - cognitivecomputations/samantha-data
+ - microsoft/orca-math-word-problems-200k
+ - abacusai/SystemChat-1.1
+ - Locutusque/function-calling-chatml
+ - internlm/Agent-FLAN
+ language:
+ - en
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
+ # Dolphin 2.9 Mixtral 8x22b 🐬
+
+ Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations
+
+ [![Discord](https://img.shields.io/discord/1156064224225808488?logo=Discord&logoColor=%23ffffff&label=Discord&link=https%3A%2F%2Fdiscord.gg%2FtCMkMDDHwm)](https://discord.gg/cognitivecomputations)
+ Discord: https://discord.gg/cognitivecomputations
+
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png" width="600" />
+
+ This model is based on Dolphin-2.9-Mixtral-8x22b and is Apache-2.0 licensed.
+
+ The base model has a 64k context, and this full-weight fine-tuning used a 16k sequence length.
+
+ Training took 27 hours on 8x H100 GPUs provided by Crusoe Cloud.
+
+ The model was fully fine-tuned, targeting all layers.
+
+ The model is a single expert extracted with SLERP, using a custom script that we've open-sourced (GitHub link to follow). The script produces one expert that is the combined SLERP of all 8 experts from the Mixtral architecture. We decided not to convert fully to a dense model, in order to keep as much of the original model's performance as possible; the process is already quite surgical, and there are many variables to take into account.
+
+ Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling.
+
+ Dolphin is uncensored. I have filtered the dataset to remove alignment and bias. This makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant with any requests, even unethical ones. Please read my blog post about uncensored models: https://erichartford.com/uncensored-models. You are responsible for any content you create using this model. Enjoy responsibly.
+
+ Dolphin is licensed Apache 2.0. I grant permission for any use, including commercial, that complies with the Apache-2.0 license. Dolphin was trained on data generated from GPT-4, among other models.
+
+ ## Evals
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/Nb6f_dS_M6fN_v2ACK98x.png)
+

  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
  <details><summary>See axolotl config</summary>
@@ -380,10 +421,6 @@ tokens:

  # 1x22b-out

- This model is a fine-tuned version of [cognitivecomputations/mixtral-1x22b-base](https://huggingface.co/cognitivecomputations/mixtral-1x22b-base) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.4572
-
  ## Model description

  More information needed
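
The card above describes extracting a single expert by SLERP-merging all 8 experts of the Mixtral architecture. The open-sourced script is not linked in this commit, so the following is only a minimal PyTorch sketch of the idea; the tensor names in the usage comment follow the Hugging Face Mixtral layout and are illustrative, not taken from the actual script.

```python
import torch


def slerp(a: torch.Tensor, b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors of the same shape."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_dir = a_flat / (a_flat.norm() + eps)
    b_dir = b_flat / (b_flat.norm() + eps)
    # Angle between the two weight vectors.
    omega = torch.acos(torch.clamp(torch.dot(a_dir, b_dir), -1.0, 1.0))
    if omega.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        merged = (1.0 - t) * a_flat + t * b_flat
    else:
        merged = (torch.sin((1.0 - t) * omega) * a_flat
                  + torch.sin(t * omega) * b_flat) / torch.sin(omega)
    return merged.reshape(a.shape).to(a.dtype)


def combine_experts(expert_weights: list[torch.Tensor]) -> torch.Tensor:
    """Fold N expert tensors into one by iterative SLERP.

    Merging expert i with t = 1/i weights all experts roughly equally,
    analogous to a running average taken along the sphere.
    """
    merged = expert_weights[0]
    for i, w in enumerate(expert_weights[1:], start=2):
        merged = slerp(merged, w, t=1.0 / i)
    return merged


# Hypothetical usage: combine the 8 experts' w1 projections of one layer into
# the single expert slot of a 1-expert Mixtral checkpoint (names illustrative).
# experts = [state_dict[f"model.layers.0.block_sparse_moe.experts.{i}.w1.weight"]
#            for i in range(8)]
# state_dict["model.layers.0.block_sparse_moe.experts.0.w1.weight"] = combine_experts(experts)
```

A real extraction script would walk every expert tensor in the checkpoint and handle router/gating weights separately; this sketch only shows the pairwise folding that a combined SLERP of 8 experts implies.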
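
The card lists conversational, coding, and function-calling skills. As a hedged illustration of running such a checkpoint with 🤗 Transformers: the repository id below is a placeholder for the model this card ships with, and the ChatML-style chat template is assumed to be bundled with the tokenizer rather than confirmed by this diff.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id; substitute the repository this README actually belongs to.
model_id = "cognitivecomputations/dolphin-2.9-mixtral-8x22b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# apply_chat_template uses whatever template ships with the tokenizer
# (Dolphin models conventionally use a ChatML-style format).
messages = [
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Write a Python function that checks whether a number is prime."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Strip the prompt tokens and print only the generated completion.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```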