---
library_name: transformers
tags:
  - mergekit
  - merge
license: llama2
---

![logo](logo.png)

# What is this

My experiment. A continuation of the Benchmaxxxer series (meme models), but a bit more serious. It scores high on my benchmark and on the Hugging Face leaderboard, and moderately high in practice. Worth trying? Yeah, it's on the good side.

# Observations

- GPTslop: medium-low. Cut it the moment it appears, though, or the model won't stop generating it.
- Writing style: hard to describe, and not the usual stuff. Somewhat autopilot-like: even if you write your usual lazy "ahh ahh mistress", it can give you back a whole page of good text. High.
- Censorship: if you can handle Xwin, you can handle this model. Medium-high?
- Optimism: medium-low.
- Violence: medium-low.
- Intelligence: medium.
- Creativity: medium-high.
- Temperature: doesn't like high values; keep it below 1.5.

# Prompt format

Vicuna or Alpaca.
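For reference, these two formats conventionally look like the templates below; the system lines shown are the common defaults, so adjust them to taste.

Vicuna:

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
USER: {prompt}
ASSISTANT:
```

Alpaca:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```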

Merge Details

This is a merge of pre-trained language models created using mergekit.

This model was merged using the linear merge method.
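Conceptually, a linear merge is a normalized weighted average of each tensor across the input models. Here is a minimal sketch of that idea; note that mergekit's real implementation also interpolates the six-element weight lists in the configuration below across the layer stack, which this sketch omits.

```python
# Minimal sketch of a linear merge: every output tensor is the
# normalized weighted average of the corresponding tensors from
# each input model. For illustration only; mergekit additionally
# applies per-layer weight gradients.
import torch

def linear_merge(state_dicts, weights):
    """state_dicts: list of {name: tensor}; weights: one float per model."""
    total = sum(weights)
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(
            (w / total) * sd[name].float()
            for sd, w in zip(state_dicts, weights)
        ).to(torch.float16)  # dtype: float16, matching the config below
    return merged
```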

## Models Merged

The following models were included in the merge (referred to by the short names used in the configuration below):

- spicyboros
- xwin
- euryale
- dolphin
- wizard
- WinterGoddess

## Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: spicyboros
    parameters:
      weight: [0.093732305, 0.403220342, 0.055438423, 0.043830778, 0.054189303, 0.081136828]
  - model: xwin
    parameters:
      weight: [0.398943486, 0.042069007, 0.161586088, 0.470977297, 0.389315704, 0.416739102]
  - model: euryale
    parameters:
      weight: [0.061483013, 0.079698633, 0.043067724, 0.00202751, 0.132183868, 0.36578003]
  - model: dolphin
    parameters:
      weight: [0.427942847, 0.391488452, 0.442164138, 0, 0, 0.002174793]
  - model: wizard
    parameters:
      weight: [0.017898349, 0.083523566, 0.297743627, 0.175345857, 0.071770095, 0.134169247]
  - model: WinterGoddess
    parameters:
      weight: [0, 0, 0, 0.30781856, 0.352541031, 0]
merge_method: linear
dtype: float16
tokenizer_source: base
```
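If you want to try the model, a standard transformers loading snippet should work. This is an illustrative sketch; the generation settings are examples (keeping temperature below 1.5, as noted above).

```python
# Illustrative usage sketch with transformers; adjust device_map,
# quantization, and sampling settings to your hardware and taste.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ChuckMcSneed/ArcaneEntanglement-model64-70b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Vicuna-style prompt, as recommended above.
prompt = "USER: Write a short scene set in a candlelit library. ASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=1.0)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```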

# Benchmarks

## NeoEvalPlusN_benchmark

My meme benchmark.

| Name | B | C | D | S | P | total | BCD | SP |
|------|---|---|---|------|------|-------|-----|------|
| ChuckMcSneed/PMaxxxer-v1-70b | 3 | 1 | 1 | 6.75 | 4.75 | 16.5 | 5 | 11.5 |
| ChuckMcSneed/SMaxxxer-v1-70b | 2 | 1 | 0 | 7.25 | 4.25 | 14.5 | 3 | 11.5 |
| ChuckMcSneed/ArcaneEntanglement-model64-70b | 3 | 2 | 1 | 7.25 | 6 | 19.25 | 6 | 13.25 |

Absurdly high. That's what happens when you optimize the merges for a benchmark.

## Open LLM Leaderboard Evaluation Results

[Leaderboard on Huggingface](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
|-------|---------|-----|-----------|------|------------|------------|-------|
| ChuckMcSneed/ArcaneEntanglement-model64-70b | 72.79 | 71.42 | 87.96 | 70.83 | 60.53 | 83.03 | 63 |
| ChuckMcSneed/PMaxxxer-v1-70b | 72.41 | 71.08 | 87.88 | 70.39 | 59.77 | 82.64 | 62.7 |
| ChuckMcSneed/SMaxxxer-v1-70b | 72.23 | 70.65 | 88.02 | 70.55 | 60.7 | 82.87 | 60.58 |

This model is simply superior to my other meme models here.