parameters guide
samplers guide
model generation
role play settings
quant selection
arm quants
iq quants vs q quants
optimal model setting
gibberish fixes
coherence
instruction following
quality generation
chat settings
quality settings
llamacpp server
llamacpp
lmstudio
sillytavern
koboldcpp
backyard
ollama
model generation steering
steering
model generation fixes
text generation webui
ggufs
exl2
full precision
quants
imatrix
neo imatrix
Update README.md
README.md
CHANGED
You will get higher quality operation overall - stronger prose, better answers, and a higher quality adventure.

---

PARAMETERS AND SAMPLERS

---

Primary Testing Parameters I use, including use for output generation examples at my repo:
Other programs like https://www.LMStudio.ai allow access to most STANDARD samplers, whereas in others (llamacpp only here) you may need to add them to the json file(s) for a model and/or template preset.

In most cases all llama_cpp parameters/samplers are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "Ollama" and "lmstudio" (as well as other apps too).

You can also use llama_cpp directly (IE: llama-server.exe); see:
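When driving llama-server over its API, the sampler settings travel with each request. A minimal sketch of a request body for llama.cpp's /completion endpoint follows; the field names are llama.cpp's, but the values are placeholders only, not recommended settings:

```python
import json

# Example request body for llama.cpp's /completion endpoint.
# Values are placeholders - use settings appropriate to your model's class.
payload = {
    "prompt": "Write me a scene...",
    "n_predict": 512,          # max tokens to generate
    "temperature": 0.8,
    "top_k": 40,
    "top_p": 0.95,
    "min_p": 0.05,
    "repeat_penalty": 1.1,
    "repeat_last_n": 64,       # window (in tokens) the penalty looks back over
}

# POST this as the body to http://localhost:8080/completion
body = json.dumps(payload)
```

Any parameter left out of the request falls back to the server's defaults, so you only need to send the ones you are deliberately changing.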
The goal here is to use parameters to raise/lower the power of the model, and samplers to "prune" (and/or in some cases enhance) operation.

With that being said, generation "examples" (at my repo) are created using the "Primary Testing Parameters" (top of this document) settings regardless of the "class" of the model, with no advanced settings, parameters, or samplers.

However, for ANY model, regardless of "class" or whether it is at my repo, you can now take performance to the next level with the information contained in this document.

Side note:

There are no "Class 5" models published... yet.

---
However, sometimes parameters and/or samplers are required to better "wrangle" the model and get it to perform to its maximum potential and/or fine-tune it to your use case(s).

---

Section 1a : PRIMARY PARAMETERS - ALL APPS:

---

These parameters will have a SIGNIFICANT effect on prose, generation, length and content; with temp being the most powerful.
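As a rough illustration of why temp is the most powerful control, here is a minimal Python sketch (not any app's actual code) of how temperature reshapes the token distribution before sampling:

```python
import math

def softmax_with_temp(logits, temp):
    """Scale logits by 1/temp, then softmax.
    Low temp sharpens the distribution (top token dominates);
    high temp flattens it (more tokens stay in play)."""
    scaled = [l / temp for l in logits]
    m = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, 0.1]                 # made-up token logits

cool = softmax_with_temp(logits, 0.5)         # sharper: top token takes most mass
hot = softmax_with_temp(logits, 1.8)          # flatter: tail tokens gain probability
```

This is why raising temp tends to produce more varied (and eventually less coherent) prose: probability mass shifts from the model's top choices toward its long tail.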
Then test "at temp" with your prompt(s) to see the MODELS in action. (5-10 generations recommended)

---

Section 1b : PENALTY SAMPLERS - ALL APPS:

---

These samplers "trim" or "prune" output in real time.
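To make the "pruning" idea concrete, here is a toy Python sketch of a classic repetition penalty (modeled loosely on llama.cpp-style rep-pen; the token ids and values are made up):

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.5):
    """Penalize tokens that appeared in the recent window:
    positive logits are divided by the penalty, negative logits multiplied,
    making recently-seen tokens less likely to be picked again."""
    out = list(logits)
    for tok in set(recent_tokens):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

# Four made-up token logits; tokens 0 and 2 were generated recently.
logits = [3.0, 1.5, -0.5, 0.2]
penalized = apply_repeat_penalty(logits, recent_tokens=[0, 2], penalty=1.5)
```

Note the asymmetry: the penalty always pushes a repeated token's logit toward "less likely", whichever side of zero it starts on, while untouched tokens keep their original scores.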
penalize newline tokens (default: false)

Generally this is not used.

---

Section 1c : SECONDARY SAMPLERS / FILTERS - ALL APPS:

---

In some AI/LLM apps, these may only be available via JSON file modification and/or API.
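For example, a model/preset JSON might carry secondary sampler fields like these; the names follow llama.cpp conventions, the values are placeholders only, and availability varies by app:

```python
import json

# Hypothetical preset fragment; field names follow llama.cpp conventions,
# values are placeholders, not recommendations.
preset = {
    "typical_p": 1.0,        # typical sampling (1.0 = disabled)
    "tfs_z": 1.0,            # tail-free sampling (1.0 = disabled)
    "mirostat": 0,           # 0 = off, 1 = Mirostat, 2 = Mirostat 2.0
    "mirostat_tau": 5.0,     # target "surprise" level
    "mirostat_eta": 0.1,     # learning rate for the tau controller
}

preset_json = json.dumps(preset, indent=2)
```

A value of 1.0 for typical_p or tfs_z means "pass everything through", which is why these filters are effectively off until you lower them.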
For Koboldcpp a converter is available, and in oobabooga/text-generation-webui you just enter low/high/exp.

CLASS 4 only: It is suggested this is on, with a low/high of .8 to 1.8 (note the range here of "1" between low and high), and the exponent set to 1 (values below or above this work too).

To set this manually (IE: API, lmstudio, llamacpp, etc) using "range" and "exp" is a bit more tricky: (example is to set the range from .8 to 1.8)
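The low/high-to-range conversion can be sketched in a few lines. This assumes the convention (used by llama.cpp-style dynamic temp) that "range" is the distance from the base temp, i.e. the effective temp swings over [temp - range, temp + range]; check your app's docs to confirm it uses the same convention:

```python
def low_high_to_dynatemp(low, high):
    """Convert a low/high temp pair into a base temp plus dynatemp range,
    assuming the effective temp varies over [temp - range, temp + range]."""
    temp = (low + high) / 2.0   # base temp sits midway between low and high
    rng = (high - low) / 2.0    # range is half the spread
    return temp, rng

temp, rng = low_high_to_dynatemp(0.8, 1.8)  # base temp 1.3, range 0.5; exp is set separately
```

So for the .8 to 1.8 example you would set the base temp to 1.3 and the range to .5, with the exponent entered as its own field.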
Careful testing is required, as this can have unclear side effects.

---

SECTION 2: ADVANCED SAMPLERS - "text-generation-webui" / "KOBOLDCPP":

Additional Parameters / Samplers, including "DRY", "QUADRATIC" and "ANTI-SLOP".

---

Hopefully ALL these samplers / controls will be added to LLAMACPP and made available to all users via AI/LLM apps soon.

For more info on what they do / how they affect generation see:
This is a game changer in custom real time control of the model.
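As a toy illustration of the backtracking idea behind anti-slop (koboldcpp's actual implementation works at the token level and is more sophisticated; the banned phrase list here is made up):

```python
def strip_banned_phrases(text, banned):
    """Naive anti-slop pass: if a banned phrase has just been generated at the
    very end of the text, cut back to just before it (a backtrack point) so
    generation can be retried from there with that continuation blocked."""
    low = text.lower()
    for phrase in banned:
        idx = low.rfind(phrase.lower())
        if idx != -1 and idx + len(phrase) == len(text):
            return text[:idx], True   # backtrack point found
    return text, False                # nothing banned at the tail; keep going

out, hit = strip_banned_phrases(
    "Her eyes sparkled with barely contained",
    ["barely contained", "shivers down"],
)
```

Because the check runs as text is produced, the model is forced onto a different continuation the moment a banned phrase completes, rather than having it filtered out afterward.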
FINAL NOTES:

Keep in mind that these settings/samplers work in conjunction with "penalties", which is especially important for operation of CLASS 4 models for chat / role play and/or "smoother operation".
If you use the Mirostat sampler, keep in mind this will interact with these two advanced samplers too.

And...

Smaller quants may require STRONGER settings (all classes of models) due to compression damage, especially for Q2K, and IQ1/IQ2s.