How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF:
# Run inference directly in the terminal:
llama-cli -hf QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF:
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF:
# Run inference directly in the terminal:
llama-cli -hf QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF:
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF:
# Run inference directly in the terminal:
./llama-cli -hf QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF:
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF:
# Run inference directly in the terminal:
./build/bin/llama-cli -hf QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF:
Use Docker
docker model run hf.co/QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF:
Quick Links

QuantFactory Banner

QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF

This is quantized version of Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated created using llama.cpp

Original Model Card

Model Card for Model ID

VersatiLlama-Llama-3.2-3B-Instruct-Abliterated

image/webp

Model Description

Small but Smart

Fine-Tuned on Vast dataset of Conversations

Able to Generate Human like text with high performance within its size.

It is Very Versatile when compared for it's size and Parameters and offers capability almost as good as Llama 3.1 8B Instruct

Feel free to Check it out!!

Check the quantized model here: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-Imatrix-GGUF

[This model was trained for 5hrs on GPU T4 15gb vram]

  • Developed by: Meta AI
  • Fine-Tuned by: Devarui379
  • Model type: Transformers
  • Language(s) (NLP): English
  • License: cc-by-4.0

Model Sources [optional]

base model:meta-llama/Llama-3.2-3B-Instruct

Uses

Use desired System prompt when using in LM Studio The optimal chat template seems to be Jinja but feel free to test it out as you want!

Technical Specifications

Model Architecture and Objective

Llama 3.2

Hardware

NVIDIA TESLA T4 GPU 15GB VRAM

Downloads last month
1,563
GGUF
Model size
3B params
Architecture
llama
Hardware compatibility
Log In to add your hardware

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for QuantFactory/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated-GGUF

Quantized
(466)
this model