---
license: apache-2.0
language:
  - en
tags:
  - nsfw
  - not-for-all-audiences
  - roleplay
---

# InfinityKumon-2x7B

GGUF - Imatrix quants of InfinityKumon-2x7B are available.

Another MoE merge, this time of Endevor/InfinityRP-v1-7B and grimjim/kukulemon-7B.

The reason? I like InfinityRP-v1-7B so much that I wondered if I could improve it even further by merging two great models into a MoE.
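If you want to build a similar merge yourself, 2x7B MoE merges like this one are commonly produced with mergekit's `mergekit-moe` script. The config below is only an illustrative sketch, not the actual recipe used for this model; the `gate_mode`, `dtype`, and `positive_prompts` values are my assumptions.

```yaml
# Hypothetical mergekit-moe config -- NOT the author's actual settings.
base_model: Endevor/InfinityRP-v1-7B
gate_mode: hidden        # route tokens using hidden-state representations
dtype: bfloat16
experts:
  - source_model: Endevor/InfinityRP-v1-7B
    positive_prompts:
      - "roleplay"
  - source_model: grimjim/kukulemon-7B
    positive_prompts:
      - "creative writing"
```

With mergekit installed, a config like this is run as `mergekit-moe config.yaml ./InfinityKumon-2x7B`.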

## Perplexity

Measured with llama.cpp's perplexity tool on a private roleplay dataset.

| Format | PPL |
|--------|-----|
| FP16   | 3.1748 +/- 0.11928 |
| Q8_0   | 3.1734 +/- 0.11935 |
| Q6_K   | 3.1752 +/- 0.11899 |
| Q5_K_M | 3.1731 +/- 0.11892 |
| IQ4_NL | 3.1752 +/- 0.11943 |
| IQ3_M  | 3.1773 +/- 0.11528 |
| Q2_K   | 3.2309 +/- 0.11996 |

Based on the PPL, I don't really recommend using Q2_K; the other quants are fine.
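For quick local testing, here is a minimal sketch of loading one of the recommended quants with llama-cpp-python. The GGUF filename is a hypothetical placeholder (use whichever quant you downloaded), and the prompt follows the Alpaca format described below.

```python
# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="InfinityKumon-2x7B.Q5_K_M.gguf",  # placeholder filename
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if available
)

# Alpaca-style prompt (see "Prompt format" below)
prompt = (
    "### Instruction:\n"
    "Write a short greeting in character as a cheerful innkeeper.\n\n"
    "### Response:\n"
)

out = llm(prompt, max_tokens=128, stop=["### Instruction:"])
print(out["choices"][0]["text"])
```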

## Prompt format

Alpaca or ChatML.
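For reference, these are the standard templates for those two formats (generic templates, not something specific to this model):

Alpaca:

```
### Instruction:
{prompt}

### Response:
```

ChatML:

```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```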

Switch: FP16 - GGUF