Update README.md
README.md
CHANGED
@@ -34,14 +34,16 @@ tags:
 pipeline_tag: text-generation
 ---
 
-<B>Updates Dec 21 2024: (uploading quants ... refreshed, and new quants):</B>
+<B>L3-Dark-Planet-8B-GGUF - Updates Dec 21 2024: (uploading quants ... refreshed, and new quants):</B>
 - All quants have been "refreshed", quanted with the latest LLAMACPP improvements : Better instruction following, output generation across all quants.
 - All quants have also been upgraded with "more bits" for output tensor and embed for better performance (this is in addition to the "refresh")
+- All quants (including new "ARM" quants) the output tensor is set at Q8_0. Embed has also been upgraded.
 - New "ARM" quants have been added for machines that can run them. (format: ".../Q4_0_4_4.gguf")
-- New specialized quants (in addition to
+- New specialized quants (in addition to the new refresh/upgrades): "max, max-cpu" (will include this in the file name) for quants "Q2K" (max cpu only), "IQ4_XS", "Q6_K" and "Q8_0"
 - "MAX": output tensor / embed at float 16. (better instruction following/output generation than standard quants)
 - "MAX-CPU": output tensor / embed at bfloat 16, which forces these on to the CPU (Nvidia cards / other will vary), this frees up vram at cost of token/second and you get better instruction following/output generation too.
-
+- Q8_0 (Max,Max-CPU) now clocks in at almost 10 bits (average).
+-
 <h2>L3-Dark-Planet-8B-GGUF</h2>
 
 <img src="dark-planet.jpg" style="float:right; width:300px; height:300px; padding:10px;">
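The "almost 10 bits (average)" figure for Q8_0 Max / Max-CPU can be sanity-checked with a rough weighted average. The sketch below is not from the README. It assumes the commonly cited Llama 3 8B shape (vocabulary 128,256, hidden size 4,096, about 8.03B parameters in total), assumes Q8_0 costs 8.5 bits per weight (32 8-bit weights plus one 16-bit scale per block), and ignores the small norm tensors. The function name `average_bits_per_weight` is hypothetical.

```python
# Rough average bits-per-weight estimate for a "MAX"-style quant.
# All shape figures below are assumptions about Llama 3 8B, not values
# taken from the README.
VOCAB_SIZE = 128_256
HIDDEN_SIZE = 4_096
TOTAL_PARAMS = 8_030_261_248

Q8_0_BITS = 8.5    # 32 int8 weights + one 16-bit scale per block
F16_BITS = 16.0    # float16 and bfloat16 are both 16 bits per weight


def average_bits_per_weight(body_bits, embed_bits, output_bits):
    """Weighted average over embedding, output tensor, and everything else."""
    embed_params = VOCAB_SIZE * HIDDEN_SIZE
    output_params = VOCAB_SIZE * HIDDEN_SIZE
    body_params = TOTAL_PARAMS - embed_params - output_params
    total_bits = (
        body_params * body_bits
        + embed_params * embed_bits
        + output_params * output_bits
    )
    return total_bits / TOTAL_PARAMS


standard = average_bits_per_weight(Q8_0_BITS, Q8_0_BITS, Q8_0_BITS)
maxed = average_bits_per_weight(Q8_0_BITS, F16_BITS, F16_BITS)

print(f"Q8_0 everywhere:          {standard:.2f} bits/weight")
print(f"Q8_0 + 16-bit embed/out:  {maxed:.2f} bits/weight")
```

Under these assumptions the second figure comes out at roughly 9.5 bits per weight, which is in line with the "almost 10 bits" stated in the update note. The exact value depends on how the real files store the remaining tensors.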