This repository contains the unquantized Hermes+LIMARP merge in ggml format.

You can quantize the f16 ggml model to the quantization type of your choice by following the steps below:

  1. Download and extract the llama.cpp binaries (or compile them yourself if you're on Linux).
  2. Move the "quantize" executable to the same folder where you downloaded the f16 ggml model.
  3. Open a command prompt window in that same folder and type the following command, changing the output file name and quantization type as you see fit:
quantize.exe hermes-limarp-13b.ggmlv3.f16.bin hermes-limarp-13b.ggmlv3.q4_0.bin q4_0
  4. Press Enter to run the command, and the quantized model will be generated in the same folder.
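
If you want to produce more than one quantization, the command from step 3 can be assembled by a small script. The following is only a sketch: the helper name `build_quantize_command` is hypothetical, and it assumes the f16 file follows the `.f16.` naming used above and that the executable is called `quantize.exe` and sits in the current folder.

```python
from pathlib import Path
from typing import List


def build_quantize_command(
    f16_model: str,
    quant_type: str,
    quantize_exe: str = "quantize.exe",  # assumed name; adjust for your platform
) -> List[str]:
    """Build the argument list: <exe> <f16 input> <quantized output> <type>."""
    src = Path(f16_model)
    if ".f16." not in src.name:
        raise ValueError("expected an f16 model file name containing '.f16.'")
    # Derive the output name by swapping the f16 marker for the target type.
    out_name = src.name.replace(".f16.", "." + quant_type + ".")
    return [quantize_exe, str(src), str(src.with_name(out_name)), quant_type]


if __name__ == "__main__":
    cmd = build_quantize_command("hermes-limarp-13b.ggmlv3.f16.bin", "q4_0")
    print(" ".join(cmd))
    # prints: quantize.exe hermes-limarp-13b.ggmlv3.f16.bin hermes-limarp-13b.ggmlv3.q4_0.bin q4_0
    # To actually run it, pass the list to subprocess.run(cmd, check=True).
```

Returning a list rather than a single string keeps the arguments separate, so file names containing spaces are passed through safely if the list is handed to `subprocess.run`.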