|
---
base_model:
- mistralai/Mixtral-8x7B-v0.1
- Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora
- mistralai/Mixtral-8x7B-v0.1
- LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA
- rombodawg/Open_Gpt4_8x7B_v0.2
- mistralai/Mixtral-8x7B-Instruct-v0.1
- mistralai/Mixtral-8x7B-v0.1
- Sao10K/Typhon-Mixtral-v1
tags:
- mergekit
- merge
license: cc-by-4.0
---
|
# Mega-Destroyer-8x7B
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) as the base.
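
For intuition, here is a rough sketch of what DARE-TIES does to a single weight tensor. This is an illustrative simplification, not mergekit's actual implementation; the `density` and `weight` names mirror the parameters in the config further down.

```python
import torch

def dare_ties_sketch(base, finetuned, densities, weights):
    """Simplified DARE-TIES on one weight tensor (illustration only)."""
    deltas = [ft - base for ft in finetuned]  # task vectors
    pruned = []
    for delta, density, weight in zip(deltas, densities, weights):
        # DARE: randomly drop a (1 - density) fraction of each task vector,
        # then rescale the survivors by 1/density to preserve the expectation
        mask = torch.bernoulli(torch.full_like(delta, density))
        pruned.append(weight * mask * delta / density)

    # TIES: elect a per-parameter sign from the weighted sum of task vectors
    stacked = torch.stack(pruned)
    elected_sign = torch.sign(stacked.sum(dim=0))

    # Keep only contributions agreeing with the elected sign, then average them
    agree = torch.sign(stacked) == elected_sign
    merged_delta = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged_delta
```

The real implementation works tensor-by-tensor across the whole model and also honors options like `normalize` and `int8_mask`, which this sketch ignores.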
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) + [Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora](https://huggingface.co/Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora) |
|
* [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) + [LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA](https://huggingface.co/LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA) |
|
* [rombodawg/Open_Gpt4_8x7B_v0.2](https://huggingface.co/rombodawg/Open_Gpt4_8x7B_v0.2) |
|
* [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) |
|
* [Sao10K/Typhon-Mixtral-v1](https://huggingface.co/Sao10K/Typhon-Mixtral-v1) |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
models:
  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
    parameters:
      density: 0.6
      weight: 1.0
  - model: rombodawg/Open_Gpt4_8x7B_v0.2
    parameters:
      density: 0.5
      weight: 0.8
  - model: mistralai/Mixtral-8x7B-v0.1+LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA
    parameters:
      density: 0.5
      weight: 0.6
  - model: Sao10K/Typhon-Mixtral-v1
    parameters:
      density: 0.5
      weight: 0.7
  - model: mistralai/Mixtral-8x7B-v0.1+Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora
    parameters:
      density: 0.5
      weight: 0.4
merge_method: dare_ties
base_model: mistralai/Mixtral-8x7B-v0.1
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Mega-Destroyer-8x7B
```
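
If you want to reproduce the merge, mergekit can also be driven from Python. The snippet below is a sketch based on mergekit's documented Python entry points; the file paths are placeholders.

```python
import torch
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML configuration shown above (path is a placeholder)
with open("mega-destroyer.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./Mega-Destroyer-8x7B",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # run the merge on GPU if possible
        copy_tokenizer=True,             # copy the base tokenizer to the output
        lazy_unpickle=False,             # experimental low-memory loader
        low_cpu_memory=False,
    ),
)
```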
|
High-quality GGUF quants are available here: https://huggingface.co/Artefact2/Mega-Destroyer-8x7B-GGUF (thank you, Artefact2, for quantizing it using an imatrix!)
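
A minimal sketch of loading one of those quants with llama-cpp-python; the exact `.gguf` filename below is hypothetical, so check the repo for the real ones.

```python
from llama_cpp import Llama

# Filename is a placeholder; pick an actual quant from the GGUF repo
llm = Llama(
    model_path="Mega-Destroyer-8x7B.Q4_K_M.gguf",
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers to GPU if available
)
out = llm("### Instruction:\nSay hi.\n\n### Response:\n", max_tokens=64)
print(out["choices"][0]["text"])
```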
|
|
|
Hello everyone, this is Dampf. You might know me as the creator of Mythical-Destroyer-13B. |
|
|
|
This time, I collaborated with Mr.DragonFox aka FoxEngineAi, harnessing his powerful rig to deliver a merge of multiple high-quality Mixtral 8x7B models. My goal was to beat Bagel-Mistery-Tour V2 by Ycros and create the best Mixtral model to date. Did I succeed? Please try it out and decide for yourself!
|
|
|
Aside from the obvious Mixtral Instruct, included to keep its intelligence, I've merged Rombo's excellent Open_Gpt4_v0.2, which consists of Jon Durbin's Bagel-DPO-8x7B and another highly regarded model, smelborp/MixtralOrochi8x7B. It combines many different datasets, so it should be a good fit for every task you throw at it. This model acts as the reasoning part of the merge.

In contrast, Air-Striker and LimaRP sit on the creative side: they allow for great roleplay in different styles and greatly enhance the model's writing capabilities.
|
|
|
And finally, I've merged Sao10K/Typhon-Mixtral-v1 to boost the story-writing capabilities even further. It includes KoboldAI's latest Holodeck model, along with a couple of Sao10K's own recent models, combined into one package. My hope is that it captures the magic Sao10K/Fimbulvetr-11B-v2 emits, just at the intelligence level of a Mixtral model. Typhon also includes Nous Hermes 2 DPO, a high-quality instruct model that should boost intelligence and act as a balancer to all the creative ingredients in the merge.
|
|
|
What we have here is a model that should be fantastic at instruct and roleplay/creative tasks alike. So basically a general-purpose model. Perhaps the pinnacle of Rocksmashing? Idk xD I just know it includes nearly every dataset under the sun. As a result, it will likely work with just about every prompt format as well. So feel free to use Alpaca, Vicuna, ChatML, Llama 2 Chat, or whatever your heart desires.
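
For example, a plain Alpaca-style prompt (one of the formats this merge should accept; the instruction text is just a placeholder) looks like this:

```python
# Hypothetical Alpaca-format prompt for this model
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "Write a short scene where a dragon negotiates rent with its landlord.\n\n"
    "### Response:\n"
)
```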
|
|
|
A huge thank you to the creators of the fantastic datasets and fine-tunes behind the respective merges, namely Jon Durbin, Teknium, Sao10K, MistralAI, LoneStriker, NeverSleep, Suikamelon, Doctor-Shotgun, KoboldAI and more. All credit goes to them. A thank you as well to the creators of the different merges I've merged (mergeception!). And of course, thank you to MrDragonFox for lending his compute. Please enjoy :D