Dataset: yahma/alpaca-cleaned
How to use zaid646/instruction-follower-lora with PEFT:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
model = PeftModel.from_pretrained(base_model, "zaid646/instruction-follower-lora")
```

This model is a QLoRA (4-bit) fine-tuned adapter of TinyLlama/TinyLlama-1.1B-Chat-v1.0 on the Alpaca instruction-following dataset.
The adapter was trained using peft + bitsandbytes with the following configuration:
| Hyperparameter | Value |
|---|---|
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.1 |
| Target modules | q_proj, v_proj, k_proj, o_proj |
| Quantization | 4-bit NF4, double quant |
| Batch size | 2 (effective 16 with grad accum) |
| Learning rate | 1e-4 |
| Epochs | 2 |
| Max sequence length | 512 |
| Warmup steps | 50 |
| Optimizer | AdamW (paged) |
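
A few quantities follow directly from the table and are worth checking by hand. The sketch below uses only the standard library to compute the LoRA scaling factor (alpha / r), the gradient-accumulation steps implied by an effective batch size of 16, and an estimate of the adapter's trainable parameter count. The layer shapes are assumptions taken from the published TinyLlama-1.1B configuration (hidden size 2048, 22 layers, 4 key/value heads of dimension 64), not from this model card. The accumulation figure assumes a single device, and the helper names are hypothetical.

```python
# Hyperparameters copied from the configuration table.
LORA_R = 16
LORA_ALPHA = 32
PER_DEVICE_BATCH = 2
EFFECTIVE_BATCH = 16
TARGET_MODULES = ["q_proj", "v_proj", "k_proj", "o_proj"]

# ASSUMPTION: shapes from the public TinyLlama-1.1B config, not from this card.
HIDDEN = 2048
KV_DIM = 4 * 64  # 4 key/value heads (grouped-query attention), head dim 64
NUM_LAYERS = 22
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(in_features, out_features, r):
    """A LoRA pair A (r x in) and B (out x r) adds r * (in + out) weights."""
    return r * (in_features + out_features)


def total_lora_params(modules, shapes, r, num_layers):
    """Sum the adapter weights over all targeted modules in every layer."""
    per_layer = sum(lora_params(*shapes[name], r) for name in modules)
    return per_layer * num_layers


scaling = LORA_ALPHA / LORA_R
# ASSUMPTION: single-device training, so accumulation = effective / per-device.
grad_accum_steps = EFFECTIVE_BATCH // PER_DEVICE_BATCH
trainable = total_lora_params(TARGET_MODULES, MODULE_SHAPES, LORA_R, NUM_LAYERS)

print(f"LoRA scaling (alpha / r): {scaling}")
print(f"Gradient accumulation steps: {grad_accum_steps}")
print(f"Estimated trainable adapter parameters: {trainable:,}")
```

Under these assumptions the adapter trains roughly 4.5 million parameters, well under 1% of the 1.1B-parameter base model.
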

Training results:

| Metric | Value |
|---|---|
| Final loss | 1.22 |
| Train samples/sec | 6.96 |
| Train steps/sec | 0.43 |
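
The two throughput figures can be cross-checked against the batch size: samples per second divided by steps per second gives the number of samples consumed per optimizer step. The short sketch below does that arithmetic. It also converts the final loss to a perplexity, which assumes the reported loss is a mean per-token cross-entropy in nats; the model card does not state this explicitly.

```python
import math

# Values copied from the results table.
FINAL_LOSS = 1.22
SAMPLES_PER_SEC = 6.96
STEPS_PER_SEC = 0.43

# Samples consumed per optimizer step; should be close to the effective batch size.
samples_per_step = SAMPLES_PER_SEC / STEPS_PER_SEC

# ASSUMPTION: the loss is mean per-token cross-entropy in nats.
perplexity = math.exp(FINAL_LOSS)

print(f"Samples per optimizer step: {samples_per_step:.1f}")
print(f"Perplexity implied by final loss: {perplexity:.2f}")
```

The ratio comes out at about 16 samples per step, consistent with the effective batch size of 16 listed in the configuration.
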

To load the adapter on a 4-bit quantized base model and run inference:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Base model and adapter repository names
base_model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_name = "zaid646/tinyllama-1.1b-alpaca-qlora"

# Load tokenizer; reuse the EOS token for padding
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
tokenizer.pad_token = tokenizer.eos_token

# Load the base model with 4-bit NF4 quantization and double quantization
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=quant_config,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Attach the LoRA adapter to the quantized base model
model = PeftModel.from_pretrained(model, adapter_name)

# Inference with the Alpaca-style prompt format
prompt = "### Instruction:\nExplain what machine learning is.\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
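
The inference example hard-codes the prompt and prints the full decoded sequence, which includes the prompt itself. A minimal sketch of two helpers that build the same `### Instruction:` / `### Response:` prompt and pull out only the generated answer is shown below. The function names `build_prompt` and `extract_response` are hypothetical, and cutting the answer at a following `### Instruction:` header is a design choice made here, not something the model card specifies.

```python
INSTRUCTION_HEADER = "### Instruction:"
RESPONSE_HEADER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Format an instruction in the prompt layout used by the inference example."""
    return f"{INSTRUCTION_HEADER}\n{instruction.strip()}\n{RESPONSE_HEADER}\n"


def extract_response(decoded: str) -> str:
    """Return the text after the first response header in a decoded generation.

    If the model keeps going and starts a new instruction block, the answer
    is cut at that point. If no response header is found, the input is
    returned stripped.
    """
    _, found, tail = decoded.partition(RESPONSE_HEADER)
    if not found:
        return decoded.strip()
    answer, _, _ = tail.partition(INSTRUCTION_HEADER)
    return answer.strip()


prompt = build_prompt("Explain what machine learning is.")
# Stand-in for tokenizer.decode(...) output: the prompt followed by generated text.
fake_decoded = prompt + "Machine learning is a way to learn from data.\n### Instruction:\nNext task"
print(extract_response(fake_decoded))
```

In practice, `build_prompt(...)` would replace the hard-coded `prompt` string, and `extract_response(...)` would wrap the `tokenizer.decode(...)` call.
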
This model was trained using the fine-tuning-recipes framework. To reproduce:

```bash
git clone https://github.com/ZAID646/fine-tuning-recipes.git
cd fine-tuning-recipes
pip install -e .
python -m src.cli train --config recipes/qlora.yaml
```

Prompt format: `### Instruction:\n...\n### Response:\n`

Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0