Gemma-2-9B-CPT-SahabatAI-Instruct GGUF

This is a GGUF build of Gemma 2 9B, instruction-tuned by SahabatAI and quantized with Q4_K_M for efficient CPU inference.

Model Details

  • Base Model: Gemma 2 9B
  • Instruction Format: SahabatAI Instruct v1
  • Quantization: GGUF Q4_K_M (4-bit k-quant, medium variant)
  • Original Size: 18GB
  • Quantized Size: ~5GB
  • Context Length: 8192 tokens
  • License: Gemma Terms of Use
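The size figures above can be sanity-checked with some quick arithmetic. The sketch below assumes roughly 9.24 billion parameters (the count reported on the model page, not stated in the list above) and treats the "~5GB" figure as approximate, so the result is only a rough estimate.

```python
# Back-of-the-envelope check of the sizes listed in Model Details.
# ASSUMPTION: ~9.24e9 parameters; the quantized size (~5 GB) is approximate.
PARAMS = 9.24e9


def size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate file size in GB (10^9 bytes) for a given bit width."""
    return params * bits_per_weight / 8 / 1e9


def bits_per_weight(params: float, file_size_gb: float) -> float:
    """Average number of bits stored per weight for a given file size."""
    return file_size_gb * 1e9 * 8 / params


# 16-bit weights: about 18.5 GB, in line with the "18GB" original size.
print(f"16-bit size: {size_gb(PARAMS, 16):.2f} GB")

# A ~5 GB file works out to a little over 4 bits per weight on average.
print(f"Average bits per weight at 5 GB: {bits_per_weight(PARAMS, 5.0):.2f}")
```

An average slightly above 4 bits is what one would expect from a mixed-precision 4-bit scheme, where some tensors are stored at a higher bit width than others.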

Description

This model is a quantized version of Gemma 2 9B, fine-tuned by SahabatAI with a custom instruction format. The Q4_K_M quantization offers a good balance between model size, speed, and quality. The instruction format is aimed at general-purpose tasks while keeping responses coherent and reliable.

Usage

oobabooga's text-generation-webui Setup

  1. Install text-generation-webui:
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
# Run the start script for your OS: start_linux.sh, start_windows.bat, or start_macos.sh
  2. Download the model:
mkdir -p models
cd models
# Download gemma2-9B-cpt-sahabatai-instruct-v1-Q4_K_M.gguf from Hugging Face
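The download step can also be scripted. The following is a minimal sketch using only the Python standard library. It relies on the usual Hugging Face `resolve` URL pattern for files; `REPO_ID` is a hypothetical placeholder that must be replaced with this repository's actual ID, and the helper names `build_download_url` and `download_model` are illustrative.

```python
import urllib.request
from pathlib import Path

# File name taken from the download step above.
FILENAME = "gemma2-9B-cpt-sahabatai-instruct-v1-Q4_K_M.gguf"
# ASSUMPTION: placeholder only; replace with the real "<user>/<repo>" ID.
REPO_ID = "your-username/your-repo"


def build_download_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct download URL for a file hosted on Hugging Face."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download_model(repo_id: str, filename: str, models_dir: str = "models") -> Path:
    """Download the file into models_dir unless it is already there."""
    target = Path(models_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        urllib.request.urlretrieve(build_download_url(repo_id, filename), str(target))
    return target


# Example call (run from the text-generation-webui directory):
# download_model(REPO_ID, FILENAME)
```

The file is several gigabytes, so the call is left commented out. Skipping the download when the file already exists makes the script safe to re-run.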

Recommended Generation Parameters

temperature: 0.7
top_p: 0.9
top_k: 40
repetition_penalty: 1.1
max_new_tokens: 2048
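To give a feel for what the first three parameters do, here is a small, self-contained sketch of temperature scaling followed by top-k and top-p (nucleus) filtering over a toy list of logits. It is an illustration only, not the code any particular backend runs: the order in which real backends apply these filters can differ, and `repetition_penalty` and `max_new_tokens` are left out. The function names are hypothetical.

```python
import math
import random


def filter_probs(logits, temperature=0.7, top_k=40, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}


def sample(logits, rng=random, **params):
    """Draw one token index from the filtered distribution."""
    dist = filter_probs(logits, **params)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


# Toy example with four candidate tokens; the unlikely tail gets cut off.
print(filter_probs([2.0, 1.0, 0.0, -5.0]))
```

With the recommended values, the unlikely tail of the distribution is dropped before sampling, which tends to reduce incoherent output while keeping some variety.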

Example Outputs

<|system|>You are a helpful AI assistant.</|system|>

<|user|>What is the capital of Indonesia?</|user|>

<|assistant|>Jakarta is the capital city of Indonesia. It is located on the northwestern coast of Java, the most populous island in Indonesia. Jakarta serves as the country's economic, cultural, and political center.

<|user|>Write a simple Python function to calculate factorial.</|user|>

<|assistant|>Here's a simple recursive function to calculate factorial:

def factorial(n):
    if n == 0 or n == 1:
        return 1
    return n * factorial(n-1)
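
When calling the model outside a chat UI, the prompt has to be assembled by hand. The sketch below builds a prompt string in the tag layout seen in the examples above. The layout is inferred from those examples alone (closing tags on system and user turns, no closing tag on assistant turns, blank lines between turns), so treat it as an assumption and check it against the chat template shipped with the model. `build_prompt` is a hypothetical helper name.

```python
def build_prompt(messages):
    """Assemble a prompt from a list of {"role": ..., "content": ...} dicts.

    ASSUMPTION: tag layout inferred from the example transcript only.
    """
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "assistant":
            # Assistant turns in the examples carry no closing tag.
            parts.append(f"<|assistant|>{content}")
        else:
            parts.append(f"<|{role}|>{content}</|{role}|>")
    # End with an open assistant tag so the model writes the next reply.
    parts.append("<|assistant|>")
    return "\n\n".join(parts)


prompt = build_prompt([
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is the capital of Indonesia?"},
])
print(prompt)
```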

License

This model is subject to the Gemma Terms of Use. Please refer to Google's Gemma licensing terms for commercial usage.

Acknowledgments

  • SahabatAI for fine-tuning the model
  • Google for the Gemma 2 base model
  • llama.cpp for GGUF conversion tools
  • oobabooga for text-generation-webui

Support

For issues and questions:

  • Open an issue in this repository
  • Discord: [Your Discord Link]