---
license: apache-2.0
language:
- en
tags:
- story
- general usage
- roleplay
- creative
- rp
- fantasy
- story telling
- ultra high precision
---
<B>NEO CLASS Ultra Quants for: Daredevil-8B-abliterated-Ultra</B>

The NEO Class tech was created after countless investigations and over 120 lab experiments, backed by real-world testing and qualitative results.

<b>NEO Class results:</b>

Better overall function, instruction following, output quality, and stronger connections to ideas, concepts, and the world in general.

In addition, quants now operate above their "grade", so to speak:

IE: Q4 / IQ4 operate at Q5KM/Q6 levels.

Likewise, Q3 / IQ3 operate at Q4KM/Q5 levels.

Perplexity drop of 724 points for the NEO Class Imatrix quant of IQ4XS vs the regular quant of IQ4XS.

(lower is better)

<B>A funny thing happened on the way to the "lab" ...</B>

Although this model uses a "Llama3" template, we found that Command-R's template worked better, specifically for creative purposes.

This applies to both regular quants and NEO quants.

Here is Command-R's template:

<PRE>
{
  "name": "Cohere Command R",
  "inference_params": {
    "input_prefix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>",
    "input_suffix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>",
    "antiprompt": [
      "<|START_OF_TURN_TOKEN|>",
      "<|END_OF_TURN_TOKEN|>"
    ],
    "pre_prompt_prefix": "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>",
    "pre_prompt_suffix": ""
  }
}
</PRE>

This "interesting" issue was confirmed by multiple users.

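To make the template above concrete, here is a minimal sketch (not part of the original card) of how the prefix and suffix strings assemble into a single-turn prompt; the system and user text are placeholder values.

<PRE>
# Minimal sketch: assemble a Command-R style prompt from the template strings above.
# The system/user text below are placeholders, not values from this card.

PRE_PROMPT_PREFIX = "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>"
INPUT_PREFIX = "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>"
INPUT_SUFFIX = "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"

def build_prompt(system_text: str, user_text: str) -> str:
    # System turn, then the user turn, ending where the model's reply should begin.
    return f"{PRE_PROMPT_PREFIX}{system_text}{INPUT_PREFIX}{user_text}{INPUT_SUFFIX}"

print(build_prompt("You are a creative writing assistant.",
                   "Write a short scene set on a rooftop at night."))

# The two "antiprompt" strings in the template act as stop sequences during generation.
</PRE>
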
<B>Model Notes:</B>

Maximum context is 32k. Please see the original model maker's page for details and usage information for this model.

Special thanks to the model creator, MLABONNE, for making such a fantastic model:

[ https://huggingface.co/mlabonne/Daredevil-8B-abliterated ]

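For convenience, the sketch below shows one common way to fetch a GGUF quant from Hugging Face and load it at the full 32k context with llama-cpp-python; the repo_id and filename are illustrative placeholders, not the actual file names published here.

<PRE>
# Illustrative sketch only: repo_id and filename are placeholders.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="DavidAU/Daredevil-8B-abliterated-Ultra",        # placeholder repo id
    filename="daredevil-8b-abliterated-ultra.IQ4_XS.gguf",   # placeholder file name
)

# Load at the 32k maximum context mentioned above.
llm = Llama(model_path=gguf_path, n_ctx=32768)
</PRE>
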
<h3>Sample Prompt and Models Compared:</h3>

Prompt tested with "temp=0" to ensure compliance, 2048 context (the model supports 32768 context / 32k), and the "chat" template for LLAMA3.

Additional parameters are also minimized.

PROMPT: <font color="red">"Start a 1000 word scene with: The sky scraper swayed, as she watched the window in front of her on the 21 floor explode..."</font>

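A minimal sketch of how these test settings could be reproduced with llama-cpp-python is shown below; the model file name is a placeholder, and only temperature and context length are set, leaving the remaining sampling parameters at their defaults.

<PRE>
from llama_cpp import Llama

# Placeholder file name; point this at whichever quant (regular or NEO Imatrix) you are testing.
llm = Llama(model_path="daredevil-8b-abliterated-ultra.IQ4_XS.gguf", n_ctx=2048)

prompt = ("Start a 1000 word scene with: The sky scraper swayed, as she watched "
          "the window in front of her on the 21 floor explode...")

# temp=0 for deterministic, compliance-focused output; other parameters left at defaults.
out = llm(prompt, max_tokens=1200, temperature=0.0)
print(out["choices"][0]["text"])
</PRE>
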
<B>Original model IQ4XS - unaltered:</B>


<b>New NEO Class IQ4XS Imatrix:</b>
