parameters guide
samplers guide
model generation
role play settings
quant selection
arm quants
iq quants vs q quants
optimal model setting
gibberish fixes
coherence
instruction following
quality generation
chat settings
quality settings
llamacpp server
llamacpp
lmstudio
sillytavern
koboldcpp
backyard
ollama
model generation steering
steering
model generation fixes
text generation webui
ggufs
exl2
full precision
quants
imatrix
neo imatrix
Update README.md
README.md
CHANGED
You will get higher quality operation overall - stronger prose, better answers, and a higher quality adventure.

---

PARAMETERS AND SAMPLERS

---

Primary Testing Parameters I use, including use for output generation examples at my repo:
Other programs like https://www.LMStudio.ai allow access to most STANDARD samplers, whereas in others (llamacpp only here) you may need to add them to the json file(s) for a model and/or template preset.

In most cases all llama_cpp parameters/samplers are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "Ollama" and "lmstudio" (as well as other apps too).

You can also use llama_cpp directly (IE: llama-server.exe); see:
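When driving llama-server over its API, the sampler settings travel with each request. A minimal sketch of a request body for llama.cpp's /completion endpoint follows; the field names are llama.cpp's, but the values are placeholders only, not recommended settings:

```python
import json

# Example request body for llama.cpp's /completion endpoint.
# Values are placeholders - use settings appropriate to your model's class.
payload = {
    "prompt": "Write me a scene...",
    "n_predict": 512,          # max tokens to generate
    "temperature": 0.8,
    "top_k": 40,
    "top_p": 0.95,
    "min_p": 0.05,
    "repeat_penalty": 1.1,
    "repeat_last_n": 64,       # window (in tokens) the penalty looks back over
}

# POST this as the body to http://localhost:8080/completion
body = json.dumps(payload)
```

Any parameter left out of the request falls back to the server's defaults, so you only need to send the ones you are deliberately changing.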
The goal here is to use parameters to raise/lower the power of the model, and samplers to "prune" (and/or in some cases enhance) operation.

With that being said, generation "examples" (at my repo) are created using the "Primary Testing Parameters" (top of this document) settings regardless of the "class" of the model, with no advanced settings, parameters, or samplers.

However, for ANY model, regardless of "class" or whether it is at my repo, you can now take performance to the next level with the information contained in this document.

Side note:

There are no "Class 5" models published... yet.

---
However, sometimes parameters and/or samplers are required to better "wrangle" the model and get it to perform to its maximum potential and/or fine-tune it to your use case(s).

---

Section 1a : PRIMARY PARAMETERS - ALL APPS:

---

These parameters will have a SIGNIFICANT effect on prose, generation, length and content; with temp being the most powerful.
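As a rough illustration of why temp is the most powerful control, here is a minimal Python sketch (not any app's actual code) of how temperature reshapes the token distribution before sampling:

```python
import math

def softmax_with_temp(logits, temp):
    """Scale logits by 1/temp, then softmax.
    Low temp sharpens the distribution (top token dominates);
    high temp flattens it (more tokens stay in play)."""
    scaled = [l / temp for l in logits]
    m = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, 0.1]                 # made-up token logits

cool = softmax_with_temp(logits, 0.5)         # sharper: top token takes most mass
hot = softmax_with_temp(logits, 1.8)          # flatter: tail tokens gain probability
```

This is why raising temp tends to produce more varied (and eventually less coherent) prose: probability mass shifts from the model's top choices toward its long tail.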
Then test "at temp" with your prompt(s) to see the MODELS in action. (5-10 generations recommended)

---

Section 1b : PENALTY SAMPLERS - ALL APPS:

---

These samplers "trim" or "prune" output in real time.
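To make the "pruning" idea concrete, here is a toy Python sketch of a classic repetition penalty (modeled loosely on llama.cpp-style rep-pen; the token ids and values are made up):

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.5):
    """Penalize tokens that appeared in the recent window:
    positive logits are divided by the penalty, negative logits multiplied,
    making recently-seen tokens less likely to be picked again."""
    out = list(logits)
    for tok in set(recent_tokens):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

# Four made-up token logits; tokens 0 and 2 were generated recently.
logits = [3.0, 1.5, -0.5, 0.2]
penalized = apply_repeat_penalty(logits, recent_tokens=[0, 2], penalty=1.5)
```

Note the asymmetry: the penalty always pushes a repeated token's logit toward "less likely", whichever side of zero it starts on, while untouched tokens keep their original scores.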
penalize newline tokens (default: false)

Generally this is not used.

---

Section 1c : SECONDARY SAMPLERS / FILTERS - ALL APPS:

---

In some AI/LLM apps, these may only be available via JSON file modification and/or API.
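For example, a model/preset JSON might carry secondary sampler fields like these; the names follow llama.cpp conventions, the values are placeholders only, and availability varies by app:

```python
import json

# Hypothetical preset fragment; field names follow llama.cpp conventions,
# values are placeholders, not recommendations.
preset = {
    "typical_p": 1.0,        # typical sampling (1.0 = disabled)
    "tfs_z": 1.0,            # tail-free sampling (1.0 = disabled)
    "mirostat": 0,           # 0 = off, 1 = Mirostat, 2 = Mirostat 2.0
    "mirostat_tau": 5.0,     # target "surprise" level
    "mirostat_eta": 0.1,     # learning rate for the tau controller
}

preset_json = json.dumps(preset, indent=2)
```

A value of 1.0 for typical_p or tfs_z means "pass everything through", which is why these filters are effectively off until you lower them.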
For Koboldcpp a converter is available, and in oobabooga/text-generation-webui you just enter low/high/exp.

CLASS 4 only: It is suggested this is on, with a low/high of .8 to 1.8 (note the range here of "1" between low and high), and the exponent set to 1 (values below or above this work too).

To set this manually (IE: API, lmstudio, llamacpp, etc) using "range" and "exp" is a bit more tricky: (example is to set the range from .8 to 1.8)
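The low/high-to-range conversion can be sketched in a few lines. This assumes the convention (used by llama.cpp-style dynamic temp) that "range" is the distance from the base temp, i.e. the effective temp swings over [temp - range, temp + range]; check your app's docs to confirm it uses the same convention:

```python
def low_high_to_dynatemp(low, high):
    """Convert a low/high temp pair into a base temp plus dynatemp range,
    assuming the effective temp varies over [temp - range, temp + range]."""
    temp = (low + high) / 2.0   # base temp sits midway between low and high
    rng = (high - low) / 2.0    # range is half the spread
    return temp, rng

temp, rng = low_high_to_dynatemp(0.8, 1.8)  # base temp 1.3, range 0.5; exp is set separately
```

So for the .8 to 1.8 example you would set the base temp to 1.3 and the range to .5, with the exponent entered as its own field.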
Careful testing is required, as this can have unclear side effects.

---

SECTION 2: ADVANCED SAMPLERS - "text-generation-webui" / "KOBOLDCPP":

Additional Parameters / Samplers, including "DRY", "QUADRATIC" and "ANTI-SLOP".

---

Hopefully ALL these samplers / controls will be added to LLAMACPP and made available to all users via AI/LLM apps soon.

For more info on what they do / how they affect generation see:
This is a game changer in custom real time control of the model.
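As a toy illustration of the backtracking idea behind anti-slop (koboldcpp's actual implementation works at the token level and is more sophisticated; the banned phrase list here is made up):

```python
def strip_banned_phrases(text, banned):
    """Naive anti-slop pass: if a banned phrase has just been generated at the
    very end of the text, cut back to just before it (a backtrack point) so
    generation can be retried from there with that continuation blocked."""
    low = text.lower()
    for phrase in banned:
        idx = low.rfind(phrase.lower())
        if idx != -1 and idx + len(phrase) == len(text):
            return text[:idx], True   # backtrack point found
    return text, False                # nothing banned at the tail; keep going

out, hit = strip_banned_phrases(
    "Her eyes sparkled with barely contained",
    ["barely contained", "shivers down"],
)
```

Because the check runs as text is produced, the model is forced onto a different continuation the moment a banned phrase completes, rather than having it filtered out afterward.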
FINAL NOTES:

Keep in mind that these settings/samplers work in conjunction with "penalties", which is especially important for operation of CLASS 4 models for chat / role play and/or "smoother operation".
If you use the Mirostat sampler, keep in mind this will interact with these two advanced samplers too.

And...

Smaller quants may require STRONGER settings (all classes of models) due to compression damage, especially for Q2K, and IQ1/IQ2s.