R136a1 committed on
Commit
0a3a492
1 Parent(s): 9475854

Create README.md

Files changed (1)
  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ tags:
+ - nsfw
+ - not-for-all-audiences
+ - roleplay
+ ---
+
+ ## InfinityKumon-2x7B
+
+ ![InfinityKumon-2x7B](https://cdn.discordapp.com/attachments/843160171676565508/1222560876578476103/00000-3033963009.png?ex=6616a98b&is=6604348b&hm=6434f8a16f22a3515728ab38bf7230a01448b00e6136729d42d75ae0374e5802&)
+
+ GGUF imatrix quants of [InfinityKumon-2x7B](https://huggingface.co/R136a1/InfinityKumon-2x7B)
+
+ Another MoE merge of [Endevor/InfinityRP-v1-7B](https://huggingface.co/Endevor/InfinityRP-v1-7B) and [grimjim/kukulemon-7B](https://huggingface.co/grimjim/kukulemon-7B).
+
+ The reason? I like InfinityRP-v1-7B so much that I wondered whether I could improve it even further by merging two great models into an MoE.
+
+
+ ## Perplexity
+
+ Measured using llama.cpp's `perplexity` tool on a private roleplay dataset.
+
+ | Format | PPL |
+ | --- | --- |
+ | FP16 | 3.1748 +/- 0.11928 |
+ | Q8_0 | 3.1734 +/- 0.11935 |
+ | Q6_K | 3.1752 +/- 0.11899 |
+ | Q5_K_M | 3.1731 +/- 0.11892 |
+ | IQ4_NL | 3.1752 +/- 0.11943 |
+ | IQ3_M | 3.1773 +/- 0.11528 |
+ | Q2_K | 3.2309 +/- 0.11996 |
+
+ I don't really recommend using Q2_K based on its PPL; the other quants are fine.
+
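To make the gap between quants easier to see, here is a minimal sketch that expresses each PPL from the table above as a percentage change relative to FP16. The numbers are copied from the table; the names `ppl`, `relative_change`, `changes`, and `worst` are illustrative and not part of the original README.

```python
# PPL values copied from the table above: (mean, reported +/- error).
ppl = {
    "FP16": (3.1748, 0.11928),
    "Q8_0": (3.1734, 0.11935),
    "Q6_K": (3.1752, 0.11899),
    "Q5_K_M": (3.1731, 0.11892),
    "IQ4_NL": (3.1752, 0.11943),
    "IQ3_M": (3.1773, 0.11528),
    "Q2_K": (3.2309, 0.11996),
}


def relative_change(fmt, baseline="FP16"):
    """Return the PPL change of `fmt` versus `baseline`, in percent."""
    base = ppl[baseline][0]
    return (ppl[fmt][0] - base) / base * 100.0


# Percentage change for every format, FP16 included (it is 0 by definition).
changes = {fmt: relative_change(fmt) for fmt in ppl}

# The format with the largest PPL increase over FP16.
worst = max(changes, key=changes.get)

for fmt, pct in changes.items():
    print(f"{fmt:7s} {pct:+.2f}%")
print("largest increase:", worst)
```

On these numbers Q2_K is the only quant that moves by more than a tenth of a percent. Every difference, Q2_K's included, is still smaller than the reported +/- error, so treat this as a rough guide rather than a strict ranking.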
+ ### Prompt format:
+ Alpaca or ChatML
+
+ Switch: [FP16](https://huggingface.co/R136a1/InfinityKumon-2x7B) - [GGUF](https://huggingface.co/R136a1/InfinityKumon-2x7B-GGUF)