
# Codette Model Downloads

All production models and adapters are available on HuggingFace: https://huggingface.co/Raiff1982

## Quick Download

### Option 1: Auto-Download (Recommended)

```bash
pip install huggingface-hub
```

```bash
# Download base models directly
huggingface-cli download Raiff1982/Meta-Llama-3.1-8B-Instruct-Q4 \
  --local-dir models/base/

huggingface-cli download Raiff1982/Llama-3.2-1B-Instruct-Q8 \
  --local-dir models/base/

# Download adapters
huggingface-cli download Raiff1982/Codette-Adapters \
  --local-dir adapters/
```
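The same downloads can also be scripted from Python with `snapshot_download`, the `huggingface_hub` function behind `huggingface-cli download`. A minimal sketch — the repo IDs and target directories are taken from the commands above; the helper name is illustrative:

```python
# Download plan mirroring the CLI commands in this section.
DOWNLOADS = [
    ("Raiff1982/Meta-Llama-3.1-8B-Instruct-Q4", "models/base/"),
    ("Raiff1982/Llama-3.2-1B-Instruct-Q8", "models/base/"),
    ("Raiff1982/Codette-Adapters", "adapters/"),
]

def fetch_all(downloads=DOWNLOADS):
    # Import inside the function so the plan above can be inspected
    # even without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    for repo_id, local_dir in downloads:
        snapshot_download(repo_id=repo_id, local_dir=local_dir)

# Call fetch_all() to run the downloads.
```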

### Option 2: Manual Download

1. Visit https://huggingface.co/Raiff1982
2. Select a model repository
3. Click "Files and versions"
4. Download the `.gguf` files to `models/base/`
5. Download the adapters to `adapters/`

### Option 3: Using Git-LFS

```bash
git clone https://huggingface.co/Raiff1982/Meta-Llama-3.1-8B-Instruct-Q4
git lfs pull
```

## Available Models

All models ship as GGUF files (optimized for llama.cpp and compatible runtimes):

| Model | Size | Location | Type |
|-------|------|----------|------|
| Llama 3.1 8B Q4 | 4.6 GB | `Raiff1982/Meta-Llama-3.1-8B-Instruct-Q4` | Default (recommended) |
| Llama 3.1 8B F16 | 3.4 GB | `Raiff1982/Meta-Llama-3.1-8B-Instruct-F16` | High quality |
| Llama 3.2 1B Q8 | 1.3 GB | `Raiff1982/Llama-3.2-1B-Instruct-Q8` | Lightweight / CPU |
| Codette Adapters | 224 MB | `Raiff1982/Codette-Adapters` | 8 LoRA adapters |

## Setup Instructions

### Step 1: Clone Repository

```bash
git clone https://github.com/Raiff1982/Codette-Reasoning.git
cd Codette-Reasoning
```

### Step 2: Install Dependencies

```bash
pip install -r requirements.txt
```

### Step 3: Download Models

```bash
# Quick method using huggingface-cli
huggingface-cli download Raiff1982/Meta-Llama-3.1-8B-Instruct-Q4 \
  --local-dir models/base/

huggingface-cli download Raiff1982/Llama-3.2-1B-Instruct-Q8 \
  --local-dir models/base/

huggingface-cli download Raiff1982/Codette-Adapters \
  --local-dir adapters/
```

### Step 4: Verify Setup

```bash
ls -lh models/base/     # Should show 2 GGUF files (3 if you also downloaded the F16 model)
ls adapters/*.gguf      # Should show 8 adapters
```

### Step 5: Start Server

```bash
python inference/codette_server.py
# Visit http://localhost:7860
```

## HuggingFace Profile

All models are hosted at: https://huggingface.co/Raiff1982

Each model repository includes:

- Complete documentation
- Model cards with specifications
- License information
- Version history

## Offline Setup

If you have models downloaded locally:

```bash
# Just copy files to the correct locations
cp /path/to/models/*.gguf models/base/
cp /path/to/adapters/*.gguf adapters/
```

## Troubleshooting Downloads

### Issue: "Connection timeout"

Interrupted downloads can be resumed: re-run the same command (recent versions of `huggingface-cli` resume by default; older ones accept `--resume-download`). If timeouts persist, raising the `HF_HUB_DOWNLOAD_TIMEOUT` environment variable can also help.

```bash
# Re-run to resume an interrupted download
huggingface-cli download Raiff1982/Meta-Llama-3.1-8B-Instruct-Q4 \
  --local-dir models/base/ \
  --resume-download
```

### Issue: "Disk space full"

Each model needs:

- Llama 3.1 8B Q4: 4.6 GB
- Llama 3.1 8B F16: 3.4 GB
- Llama 3.2 1B: 1.3 GB
- Adapters: ~1 GB
- Total: ~10 GB minimum
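A quick pre-flight check for the ~10 GB requirement, using the standard library's `shutil.disk_usage` (the 10 GB threshold is this document's estimate, and the function name is illustrative):

```python
import shutil

def has_space(path=".", needed_gb=10.0):
    """True if the filesystem holding `path` has at least `needed_gb` free."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= needed_gb
```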

### Issue: "HuggingFace token required"

```bash
huggingface-cli login
# Paste a token from: https://huggingface.co/settings/tokens
```

## Bandwidth & Speed

Typical download times:

- Llama 3.1 8B Q4: 5-15 minutes (100 Mbps connection)
- Llama 3.2 1B: 2-5 minutes
- Adapters: 1-2 minutes
- Total: 8-22 minutes (first-time setup)
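These estimates follow from simple arithmetic (1 GB ≈ 8000 megabits; real downloads add protocol and server overhead, which is why the quoted ranges run longer than the ideal figure):

```python
def minutes_to_download(size_gb, link_mbps):
    """Ideal transfer time: gigabytes -> megabits, divided by link speed in Mbps."""
    return size_gb * 8000 / link_mbps / 60

# e.g. the 4.6 GB Q4 model on a 100 Mbps link:
# minutes_to_download(4.6, 100) -> ~6.1 minutes, near the low end of 5-15
```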

## Attribution

Models:

- Llama: Meta AI (released under the Llama Community License)
- GGUF format & quantization: ggerganov / llama.cpp (also used by Ollama)
- Adapters: Jonathan Harrison (Raiff1982)

License: See the individual model cards on HuggingFace


Once downloaded, follow DEPLOYMENT.md for production setup.

For questions, visit: https://huggingface.co/Raiff1982