---
license: other
tags:
- llama-cpp
base_model: migtissera/Tess-v2.5-Qwen2-72B
---
# pabloce/Tess-v2.5-Qwen2-72B

This model is a converted version of [migtissera/Tess-v2.5-Qwen2-72B](https://huggingface.co/migtissera/Tess-v2.5-Qwen2-72B) in GGUF format. For more details on the original model, please refer to its model card.
## Installation

To use this model with llama.cpp, install llama.cpp through Homebrew (macOS and Linux):

```shell
brew install llama.cpp
```
## Usage

### Command Line Interface (CLI)

To use the model via the CLI, run the following command:

```shell
llama-cli --hf-repo pabloce/Tess-v2.5-Qwen2-72B-gguf --hf-file tess-2.5-qwen-2-70b-q3_k_m.gguf -p "The meaning to life and the universe is"
```
### Server

To start the llama.cpp server with this model, use the following command:

```shell
llama-server --hf-repo pabloce/Tess-v2.5-Qwen2-72B-gguf --hf-file tess-2.5-qwen-2-70b-q3_k_m.gguf -c 2048
```
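Once the server is running, it exposes an HTTP API for text generation. As a minimal sketch (assuming the server's default address `http://127.0.0.1:8080` and its `/completion` endpoint, both of which are llama.cpp defaults), a request can be built and sent from Python:

```python
import json
import urllib.request

def build_completion_request(prompt, n_predict=64,
                             url="http://127.0.0.1:8080/completion"):
    """Build a POST request for llama.cpp server's /completion endpoint.

    The body is a JSON object; `prompt` is the text to continue and
    `n_predict` caps the number of tokens to generate.
    """
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )

# With the server running, send the request and print the generated text:
# req = build_completion_request("The meaning to life and the universe is")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["content"])
```

The actual network call is left commented out since it requires a running server; the `-c 2048` flag above sets the context size the server will use for such requests.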
## Alternative Usage

You can also use this checkpoint directly through the usage steps listed in the llama.cpp repository.

Clone the llama.cpp repository from GitHub:

```shell
git clone https://github.com/ggerganov/llama.cpp
```

Navigate to the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, which enables downloading models from Hugging Face. You can also include other hardware-specific flags (e.g., `LLAMA_CUDA=1` for NVIDIA GPUs on Linux):

```shell
cd llama.cpp && LLAMA_CURL=1 make
```
Run inference through the main binary:

```shell
./main --hf-repo pabloce/Tess-v2.5-Qwen2-72B-gguf --hf-file tess-2.5-qwen-2-70b-q3_k_m.gguf -p "The meaning to life and the universe is"
```

or start the server:

```shell
./server --hf-repo pabloce/Tess-v2.5-Qwen2-72B-gguf --hf-file tess-2.5-qwen-2-70b-q3_k_m.gguf -c 2048
```