Instruction Follower — LoRA

This is a LoRA adapter for TinyLlama/TinyLlama-1.1B-Chat-v1.0, fine-tuned with QLoRA (4-bit) on the Alpaca instruction-following dataset.

Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
Fine-tuning method: QLoRA (4-bit NormalFloat, double quantization)
Dataset: yahma/alpaca-cleaned (51,760 samples)
Training hardware: NVIDIA RTX 3090 (24GB VRAM)
Training time: ~1 hour 36 minutes

Model Details

The adapter was trained using peft + bitsandbytes with the following configuration:

Hyperparameter	Value
LoRA rank (r)	16
LoRA alpha	32
LoRA dropout	0.1
Target modules	q_proj, v_proj, k_proj, o_proj
Quantization	4-bit NF4, double quant
Batch size	2 (effective 16 with gradient accumulation)
Learning rate	1e-4
Epochs	2
Max sequence length	512
Warmup steps	50
Optimizer	AdamW (paged)
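
To give a sense of the adapter's size, the sketch below estimates how many trainable parameters the rank and target modules in the table imply, along with the LoRA scaling factor alpha / r. The projection shapes are an assumption taken from the base model's published configuration (hidden size 2048, 22 layers, 4 key/value heads of dimension 64), not from this card, and the helper names are hypothetical.

```python
# Assumed projection shapes (in_features, out_features) for TinyLlama-1.1B.
# These values are an assumption based on the base model's published config,
# not something stated in this model card.
HIDDEN = 2048
KV_DIM = 4 * 64  # grouped-query attention: k_proj and v_proj project to 256
NUM_LAYERS = 22

PROJ_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_param_count(rank, shapes, num_layers):
    """LoRA adds A (rank x in) and B (out x rank) for each target module."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


def lora_scaling(alpha, rank):
    """Standard LoRA scales the low-rank update by alpha / rank."""
    return alpha / rank


trainable = lora_param_count(rank=16, shapes=PROJ_SHAPES, num_layers=NUM_LAYERS)
scale = lora_scaling(alpha=32, rank=16)
print(f"trainable LoRA parameters: {trainable:,}")
print(f"LoRA scaling factor: {scale}")
```

Under these assumptions the adapter holds roughly 4.5 million trainable parameters, a small fraction of the 1.1B-parameter base model.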

Training Results

Metric	Value
Final loss	1.22
Train samples/sec	6.96
Train steps/sec	0.43
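
As a quick sanity check, the two throughput figures above can be divided to recover the number of samples consumed per optimizer step, which should be close to the effective batch size listed in the configuration. This is a minimal sketch; the gradient accumulation value it derives assumes a single GPU, which matches the listed hardware but is not stated explicitly in this card.

```python
samples_per_sec = 6.96   # "Train samples/sec" from the results table
steps_per_sec = 0.43     # "Train steps/sec" from the results table
per_device_batch = 2     # "Batch size" from the configuration table
effective_batch = 16     # effective batch size from the configuration table

# Samples processed per optimizer step, as implied by the reported throughput.
samples_per_step = samples_per_sec / steps_per_sec

# Assumption: one GPU, so effective batch = per-device batch * accumulation steps.
grad_accum_steps = effective_batch // per_device_batch

print(f"samples per optimizer step: {samples_per_step:.1f}")
print(f"implied gradient accumulation steps: {grad_accum_steps}")
```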

How to Use

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Base model
base_model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_name = "zaid646/tinyllama-1.1b-alpaca-qlora"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
tokenizer.pad_token = tokenizer.eos_token

# Load model with 4-bit quantization
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=quant_config,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load adapter
model = PeftModel.from_pretrained(model, adapter_name)

# Inference
prompt = "### Instruction:\nExplain what machine learning is.\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Recipe

This model was trained using the fine-tuning-recipes framework. To reproduce:

git clone https://github.com/ZAID646/fine-tuning-recipes.git
cd fine-tuning-recipes
pip install -e .
python -m src.cli train --config recipes/qlora.yaml

Limitations

Fine-tuned on English Alpaca data only, so performance in other languages may be worse.
Built on a 1.1B-parameter base model, so it may not match larger models on complex reasoning.
Prompts must follow the Alpaca-style format used in training (### Instruction:\n...\n### Response:\n).
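
Because the prompt format matters, it can help to build prompts with a small helper instead of by hand. The sketch below produces the Alpaca-style template described above and strips the echoed prompt from decoded output. The function names build_prompt and extract_response are hypothetical and not part of this repository.

```python
RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the Alpaca-style template this adapter expects."""
    return f"### Instruction:\n{instruction.strip()}\n{RESPONSE_MARKER}\n"


def extract_response(generated_text: str) -> str:
    """Return only the text after the response marker.

    Decoded generations include the prompt, so everything up to and
    including the first marker is dropped. If the marker is missing,
    the full text is returned unchanged apart from whitespace trimming.
    """
    _, sep, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if sep else generated_text.strip()


prompt = build_prompt("Explain what machine learning is.")
print(repr(prompt))
```

The prompt returned by build_prompt can be passed to the tokenizer in place of a hand-written string, and extract_response can be applied to the result of tokenizer.decode.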
