Mistral-7B QLoRA SFT: Fine-tuned on Alpaca

A QLoRA adapter for Mistral-7B-v0.1, fine-tuned on the Stanford Alpaca dataset using Supervised Fine-Tuning (SFT).

Author: Beybars

Model Details

| Parameter | Value |
|---|---|
| Base model | mistralai/Mistral-7B-v0.1 |
| Method | QLoRA (4-bit NF4 quantization) |
| LoRA rank (r) | 16 |
| LoRA alpha | 16 |
| LoRA dropout | 0.1 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Dataset | tatsu-lab/alpaca (~51K examples) |
| Epochs | 3 |
| Learning rate | 1e-4 |
| Batch size | 4 (gradient accumulation 4, effective batch size 16) |
| Optimizer | paged_adamw_8bit |
| Precision | bfloat16 |
| GPU | NVIDIA RTX 3090 (24 GB) |
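
As a rough sanity check on these settings, the number of trainable LoRA weights can be estimated by hand. The sketch below assumes the layer shapes from the public Mistral-7B-v0.1 configuration (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128); those shapes are not stated in this card, and the helper name `lora_params` is illustrative.

```python
# Estimate trainable LoRA parameters for the settings listed above.
# Assumption: layer shapes come from the public Mistral-7B-v0.1 config,
# not from this model card.
HIDDEN = 4096          # hidden_size
INTERMEDIATE = 14336   # intermediate_size of the MLP
KV_DIM = 8 * 128       # num_key_value_heads * head_dim
NUM_LAYERS = 32

LORA_R = 16
LORA_ALPHA = 16

# (in_features, out_features) of each targeted linear layer
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, r):
    """A is (r x in) and B is (out x r), so one adapter adds r * (in + out) weights."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, LORA_R) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / LORA_R  # factor applied to the low-rank update

print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters in total:  {total:,}")
print(f"LoRA scaling (alpha / r):  {scaling}")
```

Under these assumptions the adapter holds about 42 million trainable weights, a small fraction of the 7B base model. The count covers only the low-rank matrices and ignores any embedding rows that may have been saved alongside the adapter.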

Quick Start

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load base model in 4-bit (NF4 weights, bfloat16 compute)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the tokenizer saved with the adapter and resize the base model's
# embeddings to match its vocabulary size
tokenizer = AutoTokenizer.from_pretrained("Beybars/mistral-7b-qlora-sft")
base_model.resize_token_embeddings(len(tokenizer))

# Load adapter
model = PeftModel.from_pretrained(base_model, "Beybars/mistral-7b-qlora-sft")

# Generate
prompt = "### Instruction:\nExplain quantum computing in simple terms.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature and top_p to take effect
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
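
The decoded output above includes the prompt as well as the generated text. A minimal sketch of two helpers that build the single-turn prompt shown in the Quick Start and strip it from the decoded string follows; `build_prompt` and `extract_response` are hypothetical names, not part of the adapter repository, and the stand-in string replaces a real model call.

```python
# Hypothetical helpers; the names are illustrative and not part of the adapter repo.
RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Wrap a single-turn instruction in the format used in the Quick Start."""
    return f"### Instruction:\n{instruction.strip()}\n\n{RESPONSE_MARKER}\n"


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker, or the whole string if the marker is absent."""
    _, found, tail = decoded.partition(RESPONSE_MARKER)
    return tail.strip() if found else decoded.strip()


prompt = build_prompt("Explain quantum computing in simple terms.")

# Stand-in for tokenizer.decode(outputs[0], skip_special_tokens=True)
decoded = prompt + "Quantum computers use qubits."
print(extract_response(decoded))
```

In the Quick Start, `prompt` would be passed to the tokenizer as before and `extract_response` applied to the decoded output.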

Training Results

| Metric | Value |
|---|---|
| Final training loss | ~1.05 |
| Final eval loss | ~1.10 |
| Gradient norm | Stable at ~0.5–0.6 |
| Training time | ~15 hours (3 epochs) |
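
To put these numbers in context, the losses can be converted to perplexity and the run length to an approximate step count. This is a back-of-the-envelope sketch under two assumptions that the card does not confirm: the losses are mean per-token cross-entropy in nats, and the training split holds 51,000 examples (a stand-in for "~51K").

```python
import math

# Values reported in this card
train_loss = 1.05
eval_loss = 1.10
batch_size = 4
grad_accum = 4
epochs = 3
train_hours = 15

# Assumption: the exact training-set size is not given; 51,000 approximates "~51K".
num_examples = 51_000

effective_batch = batch_size * grad_accum
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs
seconds_per_step = train_hours * 3600 / total_steps

# Assumption: losses are mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
train_ppl = math.exp(train_loss)
eval_ppl = math.exp(eval_loss)

print(f"Effective batch size: {effective_batch}")
print(f"Optimizer steps:      {total_steps} ({steps_per_epoch} per epoch)")
print(f"Seconds per step:     {seconds_per_step:.2f}")
print(f"Train perplexity:     {train_ppl:.2f}")
print(f"Eval perplexity:      {eval_ppl:.2f}")
```

With these inputs the eval perplexity comes out at roughly 3.0 and the run at on the order of 9,500 optimizer steps. The per-step time is only indicative, since the reported 15 hours may include evaluation and checkpointing.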

Intended Use

This adapter is a learning project that demonstrates QLoRA fine-tuning on a consumer GPU. It is not intended for production use.

Limitations

  • Trained on the Alpaca dataset, which has known quality issues
  • Base model (Mistral-7B-v0.1) has its own biases and limitations
  • Single-turn instruction following only

Framework Versions

  • PEFT: 0.18.1
  • TRL: 0.27.2
  • Transformers: 5.0.0
  • PyTorch: 2.10.0
  • bitsandbytes: 0.46.0
  • Datasets: 4.5.0
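
When reproducing the setup, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using the standard library's `importlib.metadata`; the distribution names (for example `torch` for PyTorch) are assumptions about how the packages are installed, and `compare_versions` is an illustrative helper.

```python
from importlib import metadata

# Versions listed in this card, keyed by distribution name
# (assumption: PyTorch is installed as "torch").
EXPECTED_VERSIONS = {
    "peft": "0.18.1",
    "trl": "0.27.2",
    "transformers": "5.0.0",
    "torch": "2.10.0",
    "bitsandbytes": "0.46.0",
    "datasets": "4.5.0",
}


def compare_versions(expected):
    """Map each package to (expected, installed); installed is None if the package is missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED_VERSIONS).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

The comparison is an exact string match, so a PyTorch build with a local suffix (such as a CUDA tag) is reported as differing even when the base version is the same.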

Source Code

Training code: github.com/beybars1/llm-tuning
