metadata
license: gemma
library_name: transformers
pipeline_tag: text-generation
tags:
  - conversational
  - gguf
  - llamacpp

Gemma 2 9b Instruction Tuned - GGUF

These are GGUF quants of google/gemma-2-9b-it.

Details about the model can be found on the google/gemma-2-9b-it model page.

llama.cpp Version

These quants were made with llama.cpp tag b3408.

If you have problems loading these models, please update your software to use the latest llama.cpp version.
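
One way to check whether a local build is recent enough is to compare its build number against 3408, the build used for these quants. The sketch below assumes that `llama-cli --version` prints a line of the form `version: <build> (<commit>)`, which may differ between releases. The helper names `parse_build_number` and `build_is_new_enough` are hypothetical.

```shell
#!/bin/bash

# Build number of the llama.cpp tag used to make these quants (b3408).
REQUIRED_BUILD=3408

# Read version output on stdin and print the first build number found.
# Assumption: the output contains a line shaped like "version: 3408 (abcdef0)".
parse_build_number() {
    sed -n 's/^version: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed when the given build number is non-empty and at least REQUIRED_BUILD.
build_is_new_enough() {
    [ -n "$1" ] && [ "$1" -ge "${REQUIRED_BUILD}" ]
}

# Example (not run here); 2>&1 captures the banner from either output stream:
#   build="$(./llama-cli --version 2>&1 | parse_build_number)"
#   build_is_new_enough "${build}" || echo "Please update llama.cpp."
```

If the version line has a different shape in your build, `parse_build_number` prints nothing and `build_is_new_enough` fails, so the check errs on the side of asking for an update.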

Perplexity Scoring

Below are the perplexity scores for the GGUF models. A lower score is better.

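| Quant Level | Perplexity Score | Standard Deviation |
|-------------|------------------|--------------------|
| F32         | 8.7849           | 0.06498            |
| BF16        | 8.7849           | 0.06498            |
| Q8_0        | 8.7869           | 0.06500            |
| Q6_K        | 8.7972           | 0.06510            |
| Q5_K_M      | 8.7791           | 0.06489            |
| Q5_K_S      | 8.7899           | 0.06503            |
| Q4_K_M      | 8.8745           | 0.06575            |
| Q4_K_S      | 8.9293           | 0.06636            |
| Q3_K_L      | 9.0210           | 0.06693            |
| Q3_K_M      | 9.1213           | 0.06784            |
| Q3_K_S      | 9.1857           | 0.06726            |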
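
To put the table in relative terms, each quant's perplexity can be expressed as a percentage increase over the F32 baseline. A minimal sketch using `awk` follows. The function name `ppl_increase_pct` is hypothetical, and the scores are copied from the table above.

```shell
#!/bin/bash

# Print the percentage increase of a quant's perplexity over a baseline.
# Usage: ppl_increase_pct <baseline_ppl> <quant_ppl>
ppl_increase_pct() {
    awk -v base="$1" -v quant="$2" \
        'BEGIN { printf "%.2f\n", (quant - base) / base * 100 }'
}

# F32 baseline from the table above.
F32_PPL=8.7849

# A few rows from the table, relative to F32.
echo "Q8_0:   $(ppl_increase_pct "${F32_PPL}" 8.7869)%"
echo "Q4_K_M: $(ppl_increase_pct "${F32_PPL}" 8.8745)%"
echo "Q3_K_S: $(ppl_increase_pct "${F32_PPL}" 9.1857)%"
```

With these numbers, Q4_K_M comes out roughly 1% above F32 and Q3_K_S roughly 4.6% above. Differences much smaller than the reported deviation of about 0.065, such as Q5_K_M scoring slightly below F32, are probably within measurement noise.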

Quant Details

This is the script used for quantization.

#!/bin/bash

# Model name, used for the input directory and the output file names
MODEL_NAME="gemma-2-9b-it"

# Define the output directory
outputDir="${MODEL_NAME}-GGUF"

# Create the output directory if it doesn't exist
mkdir -p "${outputDir}"

# Make the F32 quant
f32file="${outputDir}/${MODEL_NAME}-F32.gguf"
if [ -f "${f32file}" ]; then
    echo "Skipping f32 as ${f32file} already exists."
else
    # Use ${HOME} rather than ~, because a tilde inside double quotes is not expanded
    python convert_hf_to_gguf.py "${HOME}/src/models/${MODEL_NAME}" --outfile "${f32file}" --outtype "f32"
fi

# Abort if the F32 conversion did not produce a file
if [ ! -f "${f32file}" ]; then
   echo "No ${f32file} found."
   exit 1
fi

# Define the array of quantization strings
quants=("Q8_0" "Q6_K" "Q5_K_M" "Q5_K_S" "Q4_K_M" "Q4_K_S" "Q3_K_L" "Q3_K_M" "Q3_K_S")


# Loop through the quants array
for quant in "${quants[@]}"; do
    outfile="${outputDir}/${MODEL_NAME}-${quant}.gguf"
    
    # Check if the outfile already exists
    if [ -f "${outfile}" ]; then
        echo "Skipping ${quant} as ${outfile} already exists."
    else
        # Run the command with the current quant string and report the outcome
        if ./llama-quantize "${f32file}" "${outfile}" "${quant}"; then
            echo "Processed ${quant} and generated ${outfile}"
        else
            echo "Failed to quantize ${quant}." >&2
        fi
    fi
done
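
After the script finishes, it can be useful to confirm that every expected file was actually written. The sketch below rebuilds the same file names the script uses and prints any that are missing. The function name `list_missing_quants` is hypothetical, and the list of levels is assumed to match the script above (F32 plus the entries of `quants`).

```shell
#!/bin/bash

# Print the path of every expected GGUF file that is absent from the output directory.
# Usage: list_missing_quants <model_name> <output_dir>
list_missing_quants() {
    # F32 plus the same quant levels as the quantization script.
    for level in F32 Q8_0 Q6_K Q5_K_M Q5_K_S Q4_K_M Q4_K_S Q3_K_L Q3_K_M Q3_K_S; do
        file="$2/$1-${level}.gguf"
        # Report the file only when it does not exist.
        [ -f "${file}" ] || echo "${file}"
    done
}

# Example: report missing files for the model quantized above.
list_missing_quants "gemma-2-9b-it" "gemma-2-9b-it-GGUF"
```

An empty result means all ten files are present. Any printed path can be regenerated by rerunning the quantization script, which skips files that already exist.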