---
base_model:
- elinas/Llama-3-15B-Instruct-zeroed
- elinas/Llama-3-15B-Instruct-ft-v2
- PJMixers/LLaMa-3-Stheno-v3.2-Zeroed-15B
library_name: transformers
tags:
- mergekit
- merge
---
# LLaMa-3-Stheno-v3.2-15B - EXL2 8.08bpw

This is an 8bpw EXL2 quant of [PJMixers/LLaMa-3-Stheno-v3.2-15B](https://huggingface.co/PJMixers/LLaMa-3-Stheno-v3.2-15B)

This quant was made using exllamav2-0.0.21 with the [Bluemoon-light dataset](https://huggingface.co/datasets/ParasiticRogue/Bluemoon-Light) for RP.

I briefly tested this quant in some random RPs (including one over 8k context, with RoPE scaling as recommended in webui, maybe with alpha_value a bit higher) and it seems to work fine.

During quanting I noticed that a lot of layers in the middle of the model had suspiciously low error values. This resulted in a lower quant size, as the script must have thought that these layers weren't important and used lower bpw for them. Despite this, the model seems to work well, at least for me.

## Prompt Templates

Seems to use the llama3 prompt template.

### Original readme below

---

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with [elinas/Llama-3-15B-Instruct-zeroed](https://huggingface.co/elinas/Llama-3-15B-Instruct-zeroed) as the base.

### Models Merged

The following models were included in the merge:
* [elinas/Llama-3-15B-Instruct-ft-v2](https://huggingface.co/elinas/Llama-3-15B-Instruct-ft-v2)
* [PJMixers/LLaMa-3-Stheno-v3.2-Zeroed-15B](https://huggingface.co/PJMixers/LLaMa-3-Stheno-v3.2-Zeroed-15B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: task_arithmetic
dtype: bfloat16
base_model: elinas/Llama-3-15B-Instruct-zeroed
models:
  - model: elinas/Llama-3-15B-Instruct-ft-v2
    parameters:
      weight: 1.0
  - model: PJMixers/LLaMa-3-Stheno-v3.2-Zeroed-15B
    parameters:
      weight: 1.0
```
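
For intuition only, here is a rough sketch of what a task arithmetic merge with both weights set to 1.0 computes: each model's difference from the base (its "task vector") is scaled by its weight and added back onto the base. This is not mergekit's code. The `task_arithmetic` function, the `Tensor` alias, and the toy numbers are made up for illustration, and plain Python lists stand in for real weight tensors.

```python
from typing import Dict, List

Tensor = List[float]  # toy stand-in for a real weight tensor (assumption)


def task_arithmetic(
    base: Dict[str, Tensor],
    models: List[Dict[str, Tensor]],
    weights: List[float],
) -> Dict[str, Tensor]:
    """Return base + sum_i weights[i] * (models[i] - base) for every parameter."""
    merged: Dict[str, Tensor] = {}
    for name, base_vals in base.items():
        out = list(base_vals)  # start from the base weights
        for model, w in zip(models, weights):
            for i, (m, b) in enumerate(zip(model[name], base_vals)):
                out[i] += w * (m - b)  # add the weighted task vector
        merged[name] = out
    return merged


# Toy values, not real model weights.
base = {"layer.weight": [1.0, 2.0, 3.0]}
ft_v2 = {"layer.weight": [1.5, 2.0, 2.0]}
stheno = {"layer.weight": [1.0, 3.0, 3.5]}

# Both weights are 1.0, matching the YAML configuration above.
merged = task_arithmetic(base, [ft_v2, stheno], [1.0, 1.0])
print(merged["layer.weight"])  # [1.5, 3.0, 2.5]
```

With both weights at 1.0, the changes each model made relative to the base are simply summed, so positions where only one model differs from the base take that model's value.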