
# Midnight-Miqu-103B-v1.0 - GGUF

These are GGUF quants of [sophosympatheia/Midnight-Miqu-103B-v1.0](https://huggingface.co/sophosympatheia/Midnight-Miqu-103B-v1.0).

Details about the model and the merge configuration can be found on the model page linked above.

Note: I'd recommend checking out the [mradermacher/Midnight-Miqu-103B-v1.0-GGUF](https://huggingface.co/mradermacher/Midnight-Miqu-103B-v1.0-GGUF) quants as well. He has IQ quants, which are likely better than my non-IQ ones.

## GGUF File sizes

| Name | Disk Size (GB) |
| ---- | -------------- |
| Midnight-Miqu-103B-v1.0-Q2_K.gguf | 35.31 |
| Midnight-Miqu-103B-v1.0-IQ3_XXS.gguf | 39.14 |
| Midnight-Miqu-103B-v1.0-Q3_K_XS.gguf | 39.08 |
| Midnight-Miqu-103B-v1.0-Q3_K_S.gguf | 41.40 |
| Midnight-Miqu-103B-v1.0-Q3_K_M.gguf | 46.20 |
| Midnight-Miqu-103B-v1.0-Q3_K_L.gguf | 50.35 |
| Midnight-Miqu-103B-v1.0-Q4_0.gguf | 54.13 |
| Midnight-Miqu-103B-v1.0-Q4_K_S.gguf | 54.55 |
| Midnight-Miqu-103B-v1.0-Q4_K_M.gguf | 57.64 |
| Midnight-Miqu-103B-v1.0-Q5_0.gguf | 66.12 |
| Midnight-Miqu-103B-v1.0-Q5_K_S.gguf | 66.12 |
| Midnight-Miqu-103B-v1.0-Q5_K_M.gguf | 67.92 |
| Midnight-Miqu-103B-v1.0-Q6_K.gguf | 78.85 |
| Midnight-Miqu-103B-v1.0-Q8_0.gguf | 102.13 |

## Joining split files

Note: HF does not support uploading files larger than 50 GB, so I have uploaded some quants as split files.

For split files, please download all parts of the file.
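
If you use the Hugging Face CLI, a pattern filter can grab every part of a split quant in one command. A minimal sketch, assuming this repo's id (Dracones/Midnight-Miqu-103B-v1.0-GGUF), the Q6_K quant, and that `huggingface_hub` is installed:

```bash
# Sketch: download all parts of the Q6_K split into the current directory.
# Assumes: pip install -U huggingface_hub
huggingface-cli download Dracones/Midnight-Miqu-103B-v1.0-GGUF \
  --include "Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-*" \
  --local-dir .
```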

To join the files, do the following. The example below is for the Q6_K quant.

Linux and macOS:

```bash
cat Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-* > Midnight-Miqu-103B-v1.0-Q6_K.gguf && rm Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-*
```

Windows command line:

```bat
COPY /B Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-a + Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-b Midnight-Miqu-103B-v1.0-Q6_K.gguf
del Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-a Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-b
```
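
If you downloaded several split quants, a short loop can join each set in one pass and print the size of the result for comparison with the table above. A sketch, assuming all parts sit fully downloaded in the current directory:

```bash
# Sketch: join every split quant in the current directory and remove the parts.
for first in *.gguf-part-a; do
    [ -e "$first" ] || continue             # skip if no split parts are present
    base="${first%-part-a}"                 # e.g. Midnight-Miqu-103B-v1.0-Q6_K.gguf
    cat "${base}"-part-* > "${base}" && rm "${base}"-part-*
    ls -lh "${base}"                        # compare against the sizes table above
done
```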

## Split details

For reference, below are the commands used to create the splits:

```bash
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q3_K_L.gguf Midnight-Miqu-103B-v1.0-Q3_K_L-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q4_0.gguf Midnight-Miqu-103B-v1.0-Q4_0-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q4_K_M.gguf Midnight-Miqu-103B-v1.0-Q4_K_M-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q4_K_S.gguf Midnight-Miqu-103B-v1.0-Q4_K_S-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q5_0.gguf Midnight-Miqu-103B-v1.0-Q5_0-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q5_K_M.gguf Midnight-Miqu-103B-v1.0-Q5_K_M-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q5_K_S.gguf Midnight-Miqu-103B-v1.0-Q5_K_S-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q6_K.gguf Midnight-Miqu-103B-v1.0-Q6_K-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q8_0.gguf Midnight-Miqu-103B-v1.0-Q8_0-part-
```
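
In these commands, `-b 40G` caps each part at 40 GB (safely under HF's 50 GB limit), and `-a 1` uses single-character suffixes, so the parts come out as `...-part-a`, `...-part-b`, and so on.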

## Quant Details

For reference, this is the script used for quantization.

```bash
#!/bin/bash

# Activate the conda environment
source ~/miniconda3/etc/profile.d/conda.sh
conda activate llamacpp

# Model name, used for the input and output paths below
MODEL_NAME="Midnight-Miqu-103B-v1.0"

# Define the output directory
outputDir="${MODEL_NAME}-GGUF"

# Create the output directory if it doesn't exist
mkdir -p "${outputDir}"

# Convert to F32 GGUF (the full-precision source for the quants below)
f32file="/mnt/storage/models/GGUF/${MODEL_NAME}-F32.gguf"
if [ -f "${f32file}" ]; then
    echo "Skipping f32 as ${f32file} already exists."
else
    # Note: ~ does not expand inside quotes, so use ${HOME} instead
    python convert.py "${HOME}/src/models/${MODEL_NAME}" --outfile "${f32file}" --outtype "f32"
fi

# Define the array of quantization strings
quants=("Q2_K" "IQ3_XXS" "Q3_K_L" "Q3_K_M" "Q3_K_S" "Q3_K_XS" "Q4_0" "Q4_K_M" "Q4_K_S" "Q5_0" "Q5_K_M" "Q5_K_S" "Q6_K" "Q8_0")

# Loop through the quants array
for quant in "${quants[@]}"; do
    outfile="${outputDir}/${MODEL_NAME}-${quant}.gguf"

    # Check if the outfile already exists
    if [ -f "${outfile}" ]; then
        echo "Skipping ${quant} as ${outfile} already exists."
    else
        # Run the command with the current quant string
        ./quantize "${f32file}" "${outfile}" "${quant}"

        echo "Processed ${quant} and generated ${outfile}"
    fi
done
```
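
The script assumes it is run from a llama.cpp checkout, since `convert.py` and the `quantize` binary both come from that project. A minimal setup sketch (the clone location and build options are assumptions, not part of the original workflow):

```bash
# Sketch: fetch and build llama.cpp so that convert.py and ./quantize exist.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make                             # builds the quantize binary, among others
pip install -r requirements.txt  # Python dependencies for convert.py
```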