Imatrix compressions of the FP merge of "D_AU-Mistral-7B-Instruct-v0.2-Bagel-DarkSapling-DPO-7B-v2.0".

"Imatrix Plus" is an upgraded form of Imatrix which uses full precision for specific parts of the compression.
As a result, all compressions will be slightly larger in size than standard 7B compressions.

This method results in a higher quality model, especially at lower compressions.
This method is applied across all compressions, from IQ1 to Q8.

Even IQ1_S - the most compressed version - works well; however, IQ4/Q4 are suggested as the minimum for quality.
Highest quality will be Q6/Q8.
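
As an illustration of fetching one of the quantized files, here is a minimal sketch using `huggingface_hub`; the repo id and filename are placeholders only, substitute the actual GGUF repo and whichever quant level you choose:

```python
from huggingface_hub import hf_hub_download

# repo_id and filename are placeholders for illustration: substitute the
# actual GGUF repo and the quant level you want (IQ4/Q4 as a practical
# minimum, Q6/Q8 for highest quality).
model_path = hf_hub_download(
    repo_id="user/D_AU-Mistral-7B-Instruct-v0.2-Bagel-DarkSapling-DPO-7B-v2.0-GGUF",
    filename="model-Q6_K.gguf",
)
print(model_path)  # local path to the downloaded file
```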

This merge was an experiment to test the already established Roleplay, Fiction and Story
generation of "DarkSapling" with some of "Bagel"'s qualities on a Mistral Instruct base.

For Imatrix Plus this was a test of high precision in specific areas of the model, leading to a slightly larger compressed file.
In addition, the Imatrix process itself used a larger "calibration" file than standard to further enhance quality.

The process added approximately 250 MB to each compressed file.
An additional enhancement added another 250 MB to each compressed file (approximately 500 MB in total).

A blank or standard Alpaca Template for text generation will work; a minimal sketch of the standard template follows.
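
The helper below is purely illustrative, assuming the widely used standard Alpaca instruction layout:

```python
# Illustrative helper only: formats a prompt using the standard Alpaca layout.
def alpaca_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

print(alpaca_prompt("Write a short scene set in a rain-soaked city."))
```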

Context length: 32768.
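
A minimal sketch of loading a quantized file at the full context length with the `llama-cpp-python` bindings; the model path is a placeholder for whichever GGUF quant you downloaded:

```python
from llama_cpp import Llama

# Placeholder path: point this at whichever GGUF quant you downloaded.
llm = Llama(model_path="model-Q6_K.gguf", n_ctx=32768)  # full 32768-token context

# Standard Alpaca prompt, as described above.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a short scene set in a rain-soaked city.\n\n"
    "### Response:\n"
)
out = llm(prompt, max_tokens=256)
print(out["choices"][0]["text"])
```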

Please see the original model card for specific details of use, additional credits and tips under "Models Merged" below.

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the SLERP merge method.
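
As background, SLERP interpolates along the great-circle arc between two weight tensors rather than along the straight line between them. Here is a minimal numpy sketch of the idea, operating on flattened tensors; it is illustrative only, not mergekit's actual implementation:

```python
import numpy as np

def slerp(t: float, w0: np.ndarray, w1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation: t=0 returns w0, t=1 returns w1."""
    v0 = w0 / (np.linalg.norm(w0) + eps)   # direction of the first tensor
    v1 = w1 / (np.linalg.norm(w1) + eps)   # direction of the second tensor
    dot = float(np.clip(np.dot(v0, v1), -1.0, 1.0))
    omega = np.arccos(dot)                 # angle between the two directions
    if np.sin(omega) < eps:                # nearly parallel: plain lerp is stable
        return (1 - t) * w0 + t * w1
    return (np.sin((1 - t) * omega) * w0 + np.sin(t * omega) * w1) / np.sin(omega)
```

In the configuration below, `t` sweeps a per-layer gradient: self-attention and MLP weights are blended in roughly opposite directions across the 32 layers, with a default of 0.5 for all other tensors.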

### Models Merged

The following models were included in the merge:
* [TeeZee/DarkSapling-7B-v2.0](https://huggingface.co/TeeZee/DarkSapling-7B-v2.0)
* [MaziyarPanahi/bagel-dpo-7b-v0.1-Mistral-7B-Instruct-v0.2-slerp](https://huggingface.co/MaziyarPanahi/bagel-dpo-7b-v0.1-Mistral-7B-Instruct-v0.2-slerp)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
      - model: MaziyarPanahi/bagel-dpo-7b-v0.1-Mistral-7B-Instruct-v0.2-slerp
        layer_range: [0, 32]
      - model: TeeZee/DarkSapling-7B-v2.0
        layer_range: [0, 32]
merge_method: slerp
base_model: MaziyarPanahi/bagel-dpo-7b-v0.1-Mistral-7B-Instruct-v0.2-slerp
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
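
To reproduce the merge, a configuration like this is normally passed to mergekit's `mergekit-yaml` command line entry point (for example, `mergekit-yaml config.yaml ./output-directory`); check the mergekit repository for current usage.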