---
library_name: transformers
tags:
- mergekit
- merge
license: llama2
---

![logo.png](logo.png)

# What is this
My experiment. A continuation of the [Benchmaxxxer series](https://huggingface.co/ChuckMcSneed/BenchmaxxxerPS-v1-123b) (meme models), but a bit more serious. It scores high on my benchmark and on the Hugging Face leaderboard, and moderately high in practice. Worth trying? Yeah. It is on the **gooder** side.

# Observations
* GPTslop: medium-low. Keep it out of your own prompts at all costs, though, or the model won't stop generating it.
* Writing style: difficult to describe. Not the usual stuff. It has a bit of an autopilot quality: if you write your usual lazy "ahh ahh mistress", it can give you a whole page of good text in return. High.
* Censorship: if you can handle Xwin, you can handle this model. Medium-high?
* Optimism: medium-low.
* Violence: medium-low.
* Intelligence: medium.
* Creativity: medium-high.
* Doesn't like high temperature. Keep below 1.5.
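
To illustrate the temperature note above, here is a minimal sketch of how temperature reshapes a next-token distribution. The logits are made up for illustration and are not taken from this model; the point is only that higher temperatures flatten the distribution, which makes unlikely tokens more probable.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for four candidate tokens (assumption, not model output).
logits = [4.0, 2.0, 1.0, 0.0]

for temperature in (0.7, 1.0, 1.5, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

As the temperature rises, the top token loses probability mass to the tail, which is one plausible reason output degrades past the 1.5 mark.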

# Prompt format
Vicuna or Alpaca.
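
The card does not spell the templates out, so the following is a sketch that assumes the commonly used Vicuna v1.1 and Alpaca (no-input) layouts. The helper names `vicuna_prompt` and `alpaca_prompt` are hypothetical; check the templates against your frontend before relying on them.

```python
# Assumed system line of the Vicuna v1.1 template.
VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def vicuna_prompt(user_message):
    """Build a single-turn prompt in the assumed Vicuna v1.1 layout."""
    return f"{VICUNA_SYSTEM} USER: {user_message} ASSISTANT:"


def alpaca_prompt(instruction):
    """Build a prompt in the assumed Alpaca layout (variant without an input field)."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


print(vicuna_prompt("Write a short story about a lighthouse."))
print(alpaca_prompt("Write a short story about a lighthouse."))
```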

## Merge Details
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.
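
As a rough illustration of what a linear merge does, the sketch below takes a weighted average of matching tensors from several models. It uses plain Python lists in place of real tensors, and the `normalize` flag mirrors my understanding of mergekit's default behaviour (weights rescaled to sum to 1); treat both as assumptions rather than mergekit's actual implementation.

```python
def linear_merge(tensors, weights, normalize=True):
    """Weighted average of same-shaped flat tensors (lists of floats)."""
    if len(tensors) != len(weights):
        raise ValueError("need exactly one weight per tensor")
    if normalize:
        total = sum(weights)
        if total == 0:
            raise ValueError("weights sum to zero, cannot normalize")
        weights = [w / total for w in weights]
    size = len(tensors[0])
    return [
        sum(w * tensor[i] for w, tensor in zip(weights, tensors))
        for i in range(size)
    ]


# Toy parameters standing in for the same tensor from two models.
model_a = [1.0, 2.0]
model_b = [3.0, 4.0]

merged = linear_merge([model_a, model_b], [0.25, 0.75])
print(merged)  # [2.5, 3.5]
```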

### Models Merged

The following models were included in the merge:
* [WinterGoddess](https://huggingface.co/Sao10K/WinterGoddess-1.4x-70B-L2)
* [WizardLM](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)
* [Spicyboros](https://huggingface.co/jondurbin/spicyboros-70b-2.2)
* [Euryale](https://huggingface.co/Sao10K/Euryale-1.3-L2-70B)
* [Xwin](https://huggingface.co/Xwin-LM/Xwin-LM-70B-V0.1)
* [Dolphin](https://huggingface.co/cognitivecomputations/dolphin-2.2-70b)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: spicyboros
    parameters:
      weight: [0.093732305,0.403220342,0.055438423,0.043830778,0.054189303,0.081136828]
  - model: xwin
    parameters:
      weight: [0.398943486,0.042069007,0.161586088,0.470977297,0.389315704,0.416739102]
  - model: euryale
    parameters:
      weight: [0.061483013,0.079698633,0.043067724,0.00202751,0.132183868,0.36578003]
  - model: dolphin
    parameters:
      weight: [0.427942847,0.391488452,0.442164138,0,0,0.002174793]
  - model: wizard
    parameters:
      weight: [0.017898349,0.083523566,0.297743627,0.175345857,0.071770095,0.134169247]
  - model: WinterGoddess
    parameters:
      weight: [0,0,0,0.30781856,0.352541031,0]
merge_method: linear
dtype: float16
tokenizer_source: base
```
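
Each `weight` list in the config has six values. As far as I understand mergekit, a list is treated as a gradient whose anchor values are interpolated across the layer stack. The sketch below assumes piecewise-linear interpolation and an 80-layer model (the Llama 2 70B depth); both are assumptions about how the config is applied, not taken from this card. It also checks that the six anchor columns each sum to about 1.

```python
# Weights copied from the YAML config above, one list per model.
WEIGHTS = {
    "spicyboros": [0.093732305, 0.403220342, 0.055438423, 0.043830778, 0.054189303, 0.081136828],
    "xwin": [0.398943486, 0.042069007, 0.161586088, 0.470977297, 0.389315704, 0.416739102],
    "euryale": [0.061483013, 0.079698633, 0.043067724, 0.00202751, 0.132183868, 0.36578003],
    "dolphin": [0.427942847, 0.391488452, 0.442164138, 0, 0, 0.002174793],
    "wizard": [0.017898349, 0.083523566, 0.297743627, 0.175345857, 0.071770095, 0.134169247],
    "WinterGoddess": [0, 0, 0, 0.30781856, 0.352541031, 0],
}

NUM_LAYERS = 80  # assumption: layer count of a Llama 2 70B model


def column_sums(weights):
    """Sum the weights of all models at each anchor position."""
    return [sum(column) for column in zip(*weights.values())]


def interpolate(anchors, num_layers):
    """Piecewise-linear interpolation of anchor values across the layers."""
    if len(anchors) == 1 or num_layers == 1:
        return [anchors[0]] * num_layers
    last = len(anchors) - 1
    result = []
    for layer in range(num_layers):
        position = layer / (num_layers - 1) * last
        lower = min(int(position), last - 1)
        fraction = position - lower
        result.append(anchors[lower] * (1 - fraction) + anchors[lower + 1] * fraction)
    return result


print([round(s, 6) for s in column_sums(WEIGHTS)])

per_layer = {name: interpolate(w, NUM_LAYERS) for name, w in WEIGHTS.items()}
for layer in (0, 40, 79):
    print(layer, {name: round(values[layer], 3) for name, values in per_layer.items()})
```

Under these assumptions, Dolphin dominates the early layers, WinterGoddess only contributes in the later middle of the stack, and Xwin carries most of the second half.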

# Benchmarks
### NeoEvalPlusN_benchmark
[My meme benchmark.](https://huggingface.co/datasets/ChuckMcSneed/NeoEvalPlusN_benchmark)

|Name                                       |B  |C  |D  |S   |P   |total|BCD|SP   |
|-------------------------------------------|---|---|---|----|----|-----|---|-----|
|ChuckMcSneed/PMaxxxer-v1-70b               |3  |1  |1  |6.75|4.75|16.5 |5  |11.5 |
|ChuckMcSneed/SMaxxxer-v1-70b               |2  |1  |0  |7.25|4.25|14.5 |3  |11.5 |
|ChuckMcSneed/ArcaneEntanglement-model64-70b|3  |2  |1  |7.25|6   |19.25|6  |13.25|

Absurdly high. That's what happens when you optimize the merges for a benchmark.

### Open LLM Leaderboard Evaluation Results
[Leaderboard on Hugging Face](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

|Model                                      |Average  |ARC      |HellaSwag|MMLU     |TruthfulQA|Winogrande|GSM8K |
|-------------------------------------------|---------|---------|---------|---------|----------|----------|------|
|ChuckMcSneed/ArcaneEntanglement-model64-70b|**72.79**|**71.42**|87.96    |**70.83**|60.53     |**83.03** |**63**|
|ChuckMcSneed/PMaxxxer-v1-70b               |72.41    |71.08    |87.88    |70.39    |59.77     |82.64     |62.7  |
|ChuckMcSneed/SMaxxxer-v1-70b               |72.23    |70.65    |**88.02**|70.55    |**60.7**  |82.87     |60.58 |

This model beats my other meme models here on the average and on every individual benchmark except HellaSwag and TruthfulQA, where SMaxxxer edges ahead.
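
As a quick sanity check on the leaderboard table, the sketch below recomputes each Average as the plain mean of the six benchmark scores. The scores are copied from the table; the 0.01 tolerance is my own choice to allow for rounding in the displayed values.

```python
# (reported average, [ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K])
LEADERBOARD = {
    "ArcaneEntanglement-model64-70b": (72.79, [71.42, 87.96, 70.83, 60.53, 83.03, 63]),
    "PMaxxxer-v1-70b": (72.41, [71.08, 87.88, 70.39, 59.77, 82.64, 62.7]),
    "SMaxxxer-v1-70b": (72.23, [70.65, 88.02, 70.55, 60.7, 82.87, 60.58]),
}

TOLERANCE = 0.01  # assumption: slack for rounding in the displayed scores


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


for name, (reported, scores) in LEADERBOARD.items():
    computed = mean(scores)
    matches = abs(computed - reported) <= TOLERANCE
    print(f"{name}: reported {reported}, computed {computed:.3f}, match={matches}")
```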