---
license: mit
pipeline_tag: text-generation
---
Imatrix compressions of the FP merge of "D_AU-Mistral-7B-Instruct-v0.2-Bagel-DarkSapling-DPO-7B-v2.0".

"Imatrix Plus" is an upgraded form of Imatrix that uses full precision for specific parts of the compression.
As a result, all compressions will be slightly larger than standard 7B compressions.

This method results in a higher quality model, especially at lower compressions.
This method is applied across all compressions from IQ1 to Q8.

Even IQ1_S - the most compressed version - works well; however, IQ4/Q4 are the suggested minimums for quality.
The highest quality will be Q6/Q8.

Q8 Imatrix Plus quality will exceed standard Q8 and Regular Imatrix Q8.

This merge was an experiment to combine the already established roleplay, fiction and story
generation of "DarkSapling" with some of "Bagel"'s qualities on a Mistral Instruct base.

For Imatrix Plus, this was a test of using high precision in specific areas of the model, which leads to a slightly larger compressed file.
In addition, the Imatrix process itself used a larger "calibration" file than standard to further enhance quality.

The process added approximately 250 MB to each compressed file.
An additional enhancement added another 250 MB to each compressed file.

A blank or standard Alpaca Template for text generation will work.
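
As a rough illustration of the Alpaca option, the sketch below wraps an instruction in the widely used no-input Alpaca layout. The function name `build_alpaca_prompt` and the sample instruction are hypothetical, and the exact wording of the system line is an assumption based on the common Alpaca format rather than anything specified for this model.

```python
# Common no-input Alpaca layout (assumed; not taken from this model's files).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a plain instruction in the Alpaca template (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


prompt = build_alpaca_prompt("Write the opening paragraph of a gothic short story.")
print(prompt)
```

The resulting string can be passed as the prompt to whichever GGUF-capable runtime is in use; with a blank template, the instruction text is sent on its own.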

Context length: 32768.

Please see the original model cards, linked under "Models Merged" below, for specific details of use, additional credits and tips.

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the SLERP merge method.
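
For readers unfamiliar with SLERP (spherical linear interpolation), here is a minimal pure-Python sketch of the general technique applied to two flat vectors. It is not mergekit's implementation; the function name `slerp`, the `eps` threshold and the fallback to plain linear interpolation for near-parallel vectors are assumptions made for illustration.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0 and t = 1 returns v1 (illustrative sketch only).
    """
    def lerp():
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp()  # angle is undefined for a zero vector

    # Angle between the two directions, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        return lerp()  # vectors are (anti)parallel; fall back to linear

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print(midpoint)
```

In a model merge the same idea is applied tensor by tensor, with `t` controlling how far each tensor moves from one parent model toward the other.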

### Models Merged

The following models were included in the merge:
* [TeeZee/DarkSapling-7B-v2.0](https://huggingface.co/TeeZee/DarkSapling-7B-v2.0)
* [MaziyarPanahi/bagel-dpo-7b-v0.1-Mistral-7B-Instruct-v0.2-slerp](https://huggingface.co/MaziyarPanahi/bagel-dpo-7b-v0.1-Mistral-7B-Instruct-v0.2-slerp)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
      - model: MaziyarPanahi/bagel-dpo-7b-v0.1-Mistral-7B-Instruct-v0.2-slerp
        layer_range: [0, 32]
      - model: TeeZee/DarkSapling-7B-v2.0
        layer_range: [0, 32]
merge_method: slerp
base_model: MaziyarPanahi/bagel-dpo-7b-v0.1-Mistral-7B-Instruct-v0.2-slerp
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

```
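
The list values under `t` above act as a gradient across the 32 layers. The sketch below shows one plausible way such a list could map to per-layer values, assuming the anchor values are spaced evenly from the first to the last layer and interpolated linearly between them. The helper `gradient_value` is hypothetical, and the exact spacing mergekit uses may differ.

```python
def gradient_value(anchors, layer_idx, num_layers):
    """Piecewise-linear interpolation of anchor values over layer positions.

    Hypothetical helper: assumes anchors are spread evenly over the layers.
    """
    if num_layers == 1 or len(anchors) == 1:
        return float(anchors[0])
    # Position of this layer on the anchor axis, from 0 to len(anchors) - 1.
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return anchors[lo] * (1 - frac) + anchors[hi] * frac


NUM_LAYERS = 32
self_attn_t = [gradient_value([0, 0.5, 0.3, 0.7, 1], i, NUM_LAYERS) for i in range(NUM_LAYERS)]
mlp_t = [gradient_value([1, 0.5, 0.7, 0.3, 0], i, NUM_LAYERS) for i in range(NUM_LAYERS)]

for i in (0, 8, 16, 24, 31):
    print(i, round(self_attn_t[i], 3), round(mlp_t[i], 3))
```

Under this reading, the attention and MLP tensors follow mirrored schedules through the layer stack, while tensors matched by neither filter use the fixed value 0.5.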