LoneStriker committed
Commit cdda024
1 Parent(s): 453523a

Update README.md

Files changed (1)
  1. README.md +22 -13
README.md CHANGED
@@ -1,11 +1,27 @@
 ---
-base_model: []
+inference: false
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
 tags:
+- mixtral
 - mergekit
 - merge
+license: apache-2.0
+datasets:
+- jondurbin/airoboros-3.2
+---
+
+# Air-Striker-Mixtral-8x7B-Instruct-ZLoss
+
+Experimental model, trained using config and [Transformers/Axolotl](https://github.com/DocShotgun/axolotl) forks provided by [Doctor-Shotgun](https://huggingface.co/Doctor-Shotgun)
+
+Model was fine-tuned from [Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) with airoboros-3.2 dataset, for 4 epochs, ChatML prompt format at 8K context length.
+
+Additionally, model was then merged with [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1):
 
 ---
-# airoboros-3.2-mixtral-zloss-merged
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
@@ -17,8 +33,8 @@ This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge
 ### Models Merged
 
 The following models were included in the merge:
-* /home/hien/models/Mixtral-8x7B-Instruct-v0.1
-* /home/hien/models/airoboros-3.2-mixtral-zloss
+* mistralai/Mixtral-8x7B-Instruct-v0.1
+* LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
 
 ### Configuration
 
@@ -26,19 +42,12 @@ The following YAML configuration was used to produce this model:
 
 ```yaml
 models:
-  - model: /home/hien/models/Mixtral-8x7B-Instruct-v0.1
+  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
     parameters:
      weight: 0.5
-  - model: /home/hien/models/airoboros-3.2-mixtral-zloss
+  - model: LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
     parameters:
      weight: 0.5
 merge_method: linear
-#merge_method: dare_ties
-#base_model: ./extra_hdd/Mixtral-8x7B-v0.1
-parameters:
-  #normalize: false
-  #int8_mask: true
 dtype: bfloat16
-
-
 ```
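
The `merge_method: linear` setting in the configuration above takes a weighted average of the two checkpoints, each contributing `weight: 0.5`. Below is a minimal sketch of that idea, assuming two already-loaded state dicts with identical architectures; it is not mergekit's implementation, which additionally handles sharded checkpoints, tokenizers, and normalization options.

```python
# Minimal sketch of a linear (weighted-average) merge over two state dicts.
# Illustrates the idea behind `merge_method: linear`; not mergekit's code.
import torch


def linear_merge(state_dict_a: dict, state_dict_b: dict,
                 weight_a: float = 0.5, weight_b: float = 0.5) -> dict:
    """Per-tensor weighted average: weight_a * A + weight_b * B."""
    merged = {}
    for name, tensor_a in state_dict_a.items():
        tensor_b = state_dict_b[name]
        avg = weight_a * tensor_a.float() + weight_b * tensor_b.float()
        merged[name] = avg.to(tensor_a.dtype)  # cast back, e.g. to bfloat16
    return merged
```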
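
The card tags the model for `transformers` and notes the ChatML prompt format, so a hedged usage sketch follows. The repository id, system prompt, and generation settings here are illustrative assumptions, not taken from the original card.

```python
# Hedged usage sketch: load the merged model with transformers and prompt it
# in ChatML format. The repo id below is an assumption based on the model name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LoneStriker/Air-Striker-Mixtral-8x7B-Instruct-ZLoss"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

# ChatML prompt format, as noted in the model card.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nExplain what a linear model merge does.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```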