---
library_name: transformers
tags:
- mergekit
- merge
license: llama2
---
![logo.png](logo.png)
# What is this
My experiment. A continuation of the [Benchmaxxxer series](https://huggingface.co/ChuckMcSneed/BenchmaxxxerPS-v1-123b) (meme models), but a bit more serious. Scores highly on my benchmark and on the Hugging Face leaderboard, and performs moderately well in practice. Worth trying? Yeah. It is on the **gooder** side.
# Observations
* GPTslop: medium-low, but once it starts generating it, it won't stop, so avoid triggering it at all costs.
* Writing style: difficult to describe; not the usual stuff. Somewhat autopilot-like: even if you write your usual lazy "ahh ahh mistress", it can give you a whole page of good text in return. High.
* Censorship: if you can handle Xwin, you can handle this model. Medium-high?
* Optimism: medium-low.
* Violence: medium-low.
* Intelligence: medium.
* Creativity: medium-high.
* Doesn't like high temperature. Keep below 1.5.
# Prompt format
Vicuna or Alpaca.
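For reference, minimal string builders for the two templates are sketched below. The system prompts shown are the common defaults for Vicuna-1.1 and Alpaca, not something this card prescribes; adjust to taste.

```python
# Sketch of the two prompt layouts this card recommends. The system
# prompts are the commonly used defaults, included here as assumptions.

def vicuna_prompt(user_message: str) -> str:
    # Vicuna-1.1 style: system line, then USER:/ASSISTANT: turns.
    return (
        "A chat between a curious user and an artificial intelligence assistant. "
        "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
        f"USER: {user_message}\nASSISTANT:"
    )

def alpaca_prompt(instruction: str) -> str:
    # Alpaca style: preamble plus ### Instruction / ### Response headers.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )

print(vicuna_prompt("Write a haiku about model merging."))
```

Stop sequences like `USER:` (Vicuna) or `### Instruction:` (Alpaca) are worth setting so the model does not continue the conversation on its own.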
## Merge Details
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.
### Models Merged
The following models were included in the merge:
* [WinterGoddess](https://huggingface.co/Sao10K/WinterGoddess-1.4x-70B-L2)
* [WizardLM](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)
* [Spicyboros](https://huggingface.co/jondurbin/spicyboros-70b-2.2)
* [Euryale](https://huggingface.co/Sao10K/Euryale-1.3-L2-70B)
* [Xwin](https://huggingface.co/Xwin-LM/Xwin-LM-70B-V0.1)
* [Dolphin](https://huggingface.co/cognitivecomputations/dolphin-2.2-70b)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
- model: spicyboros
parameters:
weight: [0.093732305,0.403220342,0.055438423,0.043830778,0.054189303,0.081136828]
- model: xwin
parameters:
weight: [0.398943486,0.042069007,0.161586088,0.470977297,0.389315704,0.416739102]
- model: euryale
parameters:
weight: [0.061483013,0.079698633,0.043067724,0.00202751,0.132183868,0.36578003]
- model: dolphin
parameters:
weight: [0.427942847,0.391488452,0.442164138,0,0,0.002174793]
- model: wizard
parameters:
weight: [0.017898349,0.083523566,0.297743627,0.175345857,0.071770095,0.134169247]
- model: WinterGoddess
parameters:
weight: [0,0,0,0.30781856,0.352541031,0]
merge_method: linear
dtype: float16
tokenizer_source: base
```
# Benchmarks
### NeoEvalPlusN_benchmark
[My meme benchmark.](https://huggingface.co/datasets/ChuckMcSneed/NeoEvalPlusN_benchmark)
|Name |B |C |D |S |P |total|BCD|SP |
|-------------------------------------------|---|---|---|----|----|-----|---|-----|
|ChuckMcSneed/PMaxxxer-v1-70b |3 |1 |1 |6.75|4.75|16.5 |5 |11.5 |
|ChuckMcSneed/SMaxxxer-v1-70b |2 |1 |0 |7.25|4.25|14.5 |3 |11.5 |
|ChuckMcSneed/ArcaneEntanglement-model64-70b|3 |2 |1 |7.25|6 |19.25|6 |13.25|
Absurdly high. That's what happens when you optimize the merges for a benchmark.
### Open LLM Leaderboard Evaluation Results
[Leaderboard on Huggingface](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
|Model |Average |ARC |HellaSwag|MMLU |TruthfulQA|Winogrande|GSM8K |
|-------------------------------------------|---------|---------|---------|---------|----------|----------|------|
|ChuckMcSneed/ArcaneEntanglement-model64-70b|**72.79**|**71.42**|87.96 |**70.83**|60.53 |**83.03** |**63**|
|ChuckMcSneed/PMaxxxer-v1-70b |72.41 |71.08 |87.88 |70.39 |59.77 |82.64 |62.7 |
|ChuckMcSneed/SMaxxxer-v1-70b |72.23 |70.65 |**88.02**|70.55 |**60.7** |82.87 |60.58 |
This model is simply superior to my other meme models here.