---
license: apache-2.0
tags:
- merge
- mergekit
- bardsai/jaskier-7b-dpo-v5.6
- mlabonne/AlphaMonarch-7B
- mlabonne/NeuralMonarch-7B
- macadeliccc/MBX-7B-v3-DPO
---
# Pastiche-Crown-Clown-7B-dare

## Description
This repo contains GGUF format model files for Pastiche-Crown-Clown-7B-dare.

## Files Provided
|                   Name                    |  Quant  | Bits | File Size |              Remark              |
| ----------------------------------------- | ------- | ---- | --------- | -------------------------------- |
| pastiche-crown-clown-7b-dare.IQ3_XXS.gguf | IQ3_XXS |  3   |  3.02 GB  | 3.06 bpw quantization            |
| pastiche-crown-clown-7b-dare.IQ3_S.gguf   | IQ3_S   |  3   |  3.18 GB  | 3.44 bpw quantization            |
| pastiche-crown-clown-7b-dare.IQ3_M.gguf   | IQ3_M   |  3   |  3.28 GB  | 3.66 bpw quantization mix        |
| pastiche-crown-clown-7b-dare.Q4_0.gguf    | Q4_0    |  4   |  4.11 GB  | 3.56G, +0.2166 ppl               |
| pastiche-crown-clown-7b-dare.IQ4_NL.gguf  | IQ4_NL  |  4   |  4.16 GB  | 4.25 bpw non-linear quantization |
| pastiche-crown-clown-7b-dare.Q4_K_M.gguf  | Q4_K_M  |  4   |  4.37 GB  | 3.80G, +0.0532 ppl               |
| pastiche-crown-clown-7b-dare.Q5_K_M.gguf  | Q5_K_M  |  5   |  5.13 GB  | 4.45G, +0.0122 ppl               |
| pastiche-crown-clown-7b-dare.Q6_K.gguf    | Q6_K    |  6   |  5.94 GB  | 5.15G, +0.0008 ppl               |
| pastiche-crown-clown-7b-dare.Q8_0.gguf    | Q8_0    |  8   |  7.70 GB  | 6.70G, +0.0004 ppl               |
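
As a rough aid for choosing among the files above, the following minimal sketch picks the largest quant whose file fits a given memory budget. The sizes and the file-name pattern are copied from the table; the `pick_quant` and `file_name` helpers and the `headroom_gb` allowance are hypothetical, and actual memory use also depends on context length and the runtime.

```python
# File sizes in GB, copied from the "Files Provided" table.
QUANT_SIZES_GB = {
    "IQ3_XXS": 3.02,
    "IQ3_S": 3.18,
    "IQ3_M": 3.28,
    "Q4_0": 4.11,
    "IQ4_NL": 4.16,
    "Q4_K_M": 4.37,
    "Q5_K_M": 5.13,
    "Q6_K": 5.94,
    "Q8_0": 7.70,
}


def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest quant whose file fits in budget_gb minus headroom_gb.

    headroom_gb is a hypothetical allowance for context and runtime buffers,
    not a measured value. Returns None when no file fits.
    """
    limit = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= limit}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def file_name(quant):
    """Build a file name following the naming pattern used in the table."""
    return f"pastiche-crown-clown-7b-dare.{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(budget_gb=6.0)
    print(choice, file_name(choice) if choice else "no file fits")
```

The file size is only a lower bound on memory use, so treat the result as a starting point and adjust the headroom for your own setup.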

## Parameters
| path                                       | type    | architecture       | rope_theta | sliding_window | max_position_embeddings |
| ------------------------------------------ | ------- | ------------------ | ---------- | -------------- | ----------------------- |
| CorticalStack/pastiche-crown-clown-7b-dare | mistral | MistralForCausalLM | 10000.0    | 4096           | 32768                   |

## Benchmarks
![Benchmark results for pastiche-crown-clown-7b-dare](https://i.ibb.co/Srwfpj3/pastiche-crown-clown-7b-dare.png)

# Original Model Card

<img src="pastiche-crown-clown.png" alt="Pastiche crown clown logo" width="800" style="margin-left:auto; margin-right:auto; display:block"/>

# pastiche-crown-clown-7B-dare

pastiche-crown-clown-7B-dare is a DARE merge of the following models using [mergekit](https://github.com/cg123/mergekit):
* [bardsai/jaskier-7b-dpo-v5.6](https://huggingface.co/bardsai/jaskier-7b-dpo-v5.6)
* [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B)
* [mlabonne/NeuralMonarch-7B](https://huggingface.co/mlabonne/NeuralMonarch-7B)
* [macadeliccc/MBX-7B-v3-DPO](https://huggingface.co/macadeliccc/MBX-7B-v3-DPO)

See the paper [Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch](https://arxiv.org/abs/2311.03099) for more on the method.
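
The core step of DARE, as described in the paper, is drop-and-rescale: each fine-tuned model's difference from the base (its delta) is randomly sparsified, and the surviving entries are scaled up so the expected delta is unchanged. The sketch below illustrates this on plain Python lists. The `dare_prune` helper and the toy values are hypothetical, and this is not mergekit's implementation.

```python
import random


def dare_prune(delta, density, rng):
    """Keep each delta entry with probability `density`, zero the rest,
    and rescale the kept entries by 1 / density."""
    pruned = []
    for d in delta:
        if rng.random() < density:
            pruned.append(d / density)
        else:
            pruned.append(0.0)
    return pruned


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable

    # Toy parameter vectors (hypothetical values, for illustration only).
    base = [0.10, -0.20, 0.30, 0.05]
    tuned = [0.12, -0.25, 0.28, 0.09]

    delta = [t - b for t, b in zip(tuned, base)]
    pruned = dare_prune(delta, density=0.53, rng=rng)
    print(pruned)
```

With a density of 0.53, roughly 47% of the delta entries are dropped and the remainder are multiplied by about 1.89.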

## 🧩 Configuration

```yaml
models:
  - model: bardsai/jaskier-7b-dpo-v5.6
  - model: mlabonne/AlphaMonarch-7B
    parameters:
      density: 0.53
      weight: 0.2
  - model: mlabonne/NeuralMonarch-7B
    parameters:
      density: 0.53
      weight: 0.4
  - model: macadeliccc/MBX-7B-v3-DPO
    parameters:
      density: 0.53
      weight: 0.4
merge_method: dare_ties
base_model: bardsai/jaskier-7b-dpo-v5.6
parameters:
  int8_mask: true
dtype: bfloat16
```
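
To make the role of the `weight` values concrete, here is a simplified sketch of how per-model deltas could be combined onto the base model. The weights are copied from the configuration above. The `combine` helper and the toy vectors are hypothetical, and the sketch deliberately leaves out the random pruning controlled by `density` and the sign-election step that `dare_ties` adds, so it is not what mergekit actually executes.

```python
# Weights copied from the YAML configuration. The base model
# (bardsai/jaskier-7b-dpo-v5.6) has no weight because it is the starting point.
WEIGHTS = {
    "mlabonne/AlphaMonarch-7B": 0.2,
    "mlabonne/NeuralMonarch-7B": 0.4,
    "macadeliccc/MBX-7B-v3-DPO": 0.4,
}


def combine(base, deltas, weights):
    """Return base plus the weighted sum of each model's delta.

    `deltas` maps a model name to (model parameters - base parameters).
    """
    merged = list(base)
    for name, delta in deltas.items():
        w = weights[name]
        for i, d in enumerate(delta):
            merged[i] += w * d
    return merged


if __name__ == "__main__":
    # Toy two-parameter example (hypothetical values).
    base = [1.0, 2.0]
    deltas = {
        "mlabonne/AlphaMonarch-7B": [0.5, 0.0],
        "mlabonne/NeuralMonarch-7B": [0.0, -0.5],
        "macadeliccc/MBX-7B-v3-DPO": [0.25, 0.25],
    }
    print(combine(base, deltas, WEIGHTS))
```

Because the three weights sum to 1.0, a delta shared by all three models would be applied at full strength in this simplified picture.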