Quantization made by Richard Erkhov.

Llama-3-15B-Instruct-zeroed - GGUF

Model creator: https://huggingface.co/elinas/
Original model: https://huggingface.co/elinas/Llama-3-15B-Instruct-zeroed/

Name	Quant method	Size
Llama-3-15B-Instruct-zeroed.Q2_K.gguf	Q2_K	5.35GB
Llama-3-15B-Instruct-zeroed.IQ3_XS.gguf	IQ3_XS	5.94GB
Llama-3-15B-Instruct-zeroed.IQ3_S.gguf	IQ3_S	6.24GB
Llama-3-15B-Instruct-zeroed.Q3_K_S.gguf	Q3_K_S	6.21GB
Llama-3-15B-Instruct-zeroed.IQ3_M.gguf	IQ3_M	6.43GB
Llama-3-15B-Instruct-zeroed.Q3_K.gguf	Q3_K	6.87GB
Llama-3-15B-Instruct-zeroed.Q3_K_M.gguf	Q3_K_M	6.87GB
Llama-3-15B-Instruct-zeroed.Q3_K_L.gguf	Q3_K_L	7.43GB
Llama-3-15B-Instruct-zeroed.IQ4_XS.gguf	IQ4_XS	7.68GB
Llama-3-15B-Instruct-zeroed.Q4_0.gguf	Q4_0	8.0GB
Llama-3-15B-Instruct-zeroed.IQ4_NL.gguf	IQ4_NL	8.08GB
Llama-3-15B-Instruct-zeroed.Q4_K_S.gguf	Q4_K_S	3.58GB
Llama-3-15B-Instruct-zeroed.Q4_K.gguf	Q4_K	8.48GB
Llama-3-15B-Instruct-zeroed.Q4_K_M.gguf	Q4_K_M	8.48GB
Llama-3-15B-Instruct-zeroed.Q4_1.gguf	Q4_1	8.84GB
Llama-3-15B-Instruct-zeroed.Q5_0.gguf	Q5_0	9.68GB
Llama-3-15B-Instruct-zeroed.Q5_K_S.gguf	Q5_K_S	9.68GB
Llama-3-15B-Instruct-zeroed.Q5_K.gguf	Q5_K	9.93GB
Llama-3-15B-Instruct-zeroed.Q5_K_M.gguf	Q5_K_M	9.93GB
Llama-3-15B-Instruct-zeroed.Q5_1.gguf	Q5_1	10.53GB
Llama-3-15B-Instruct-zeroed.Q6_K.gguf	Q6_K	11.48GB
Llama-3-15B-Instruct-zeroed.Q8_0.gguf	Q8_0	14.86GB

Original model description:

base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
library_name: transformers
tags:
- mergekit
- merge
license: llama3

Llama-3-15B-Instruct-zeroed

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the passthrough merge method while zeroing o_proj and down_proj in the duplicated layers, which led to a decrease in perplexity (good) compared to similar 15B merges. This was a recommendation from Charles Goddard. Thank you for sharing the method of merging, and thanks to Toasty Pigeon for bringing it to my attention!
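
One way to read the zeroing step: in a Llama-style block, o_proj and down_proj are the final linear maps that write the attention and MLP results back into the residual stream, so setting them to zero makes a duplicated block pass its input through unchanged. The toy sketch below illustrates that in plain Python. The names `residual_sublayer` and `toy_block`, the stand-in inner functions, and the omission of normalization are all simplifying assumptions for illustration, not mergekit or transformers code.

```python
def matvec(weight, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def residual_sublayer(x, inner, out_proj):
    """Toy residual sub-layer: x + out_proj @ inner(x)."""
    update = matvec(out_proj, inner(x))
    return [xi + ui for xi, ui in zip(x, update)]


def toy_block(x, o_proj, down_proj):
    """Two residual sub-layers standing in for attention and the MLP."""
    # Hypothetical stand-ins for the attention mix and the MLP activation.
    x = residual_sublayer(x, lambda v: [2.0 * e for e in v], o_proj)
    x = residual_sublayer(x, lambda v: [max(e, 0.0) for e in v], down_proj)
    return x


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [0.5, -1.5]

changed = toy_block(x, identity, identity)  # ordinary block alters x
unchanged = toy_block(x, zeros, zeros)      # zeroed projections leave x as-is

print(changed)
print(unchanged)
```

With both output projections zeroed, each sub-layer adds nothing to the residual stream, so the duplicated layers start out as no-ops and leave room for later finetuning to make use of them.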

Finetuned Version

A finetuned version of this model, which seems to improve performance, can be found at elinas/Llama-3-15B-Instruct-zeroed-ft.

Models Merged

The following models were included in the merge:

meta-llama/Meta-Llama-3-8B-Instruct

Configuration

The following YAML configuration was used to produce this model:

dtype: bfloat16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 24]
    model: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [8, 24]
    model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [8, 24]
    model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [24, 32]
    model: meta-llama/Meta-Llama-3-8B-Instruct
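
As a rough sanity check on the "15B" in the name, the sketch below counts the layers produced by the slices above and estimates the parameter total. It assumes that layer_range is half-open (start inclusive, end exclusive) and uses the published Meta-Llama-3-8B-Instruct dimensions (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, vocabulary 128256, untied output head). The helper names are hypothetical.

```python
# Layer ranges copied from the configuration, read as [start, end).
slices = [(0, 24), (8, 24), (8, 24), (24, 32)]
total_layers = sum(end - start for start, end in slices)

# Assumed Meta-Llama-3-8B-Instruct dimensions.
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 8 * 128  # 8 key/value heads, head dimension 128
VOCAB = 128256


def layer_params():
    """Weights in one decoder layer (no biases)."""
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q, o and k, v
    mlp = 3 * HIDDEN * INTERMEDIATE                   # gate, up, down
    norms = 2 * HIDDEN                                # two RMSNorm weights
    return attn + mlp + norms


def model_params(n_layers):
    """Total weights for a model with n_layers decoder layers."""
    embeddings = 2 * VOCAB * HIDDEN  # input embeddings + untied lm_head
    return n_layers * layer_params() + embeddings + HIDDEN  # + final norm


print(total_layers)                             # 64
print(round(model_params(total_layers) / 1e9, 2))  # 15.01
```

Under these assumptions the merge stacks 64 layers, twice the 32 of the base model, which works out to roughly 15 billion parameters once the shared embeddings and output head are counted once.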