language: en
license: apache-2.0
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
pipeline_tag: text-generation
tags:

  • sft
  • lora
  • alpaca
  • instruction-tuning
  • tinyllama

TinyLlama Alpaca SFT - Trial A1

Fine-tuned version of TinyLlama-1.1B-Chat-v1.0 using Supervised Fine-Tuning (SFT) with LoRA on the Stanford Alpaca dataset.
This is the best-performing trial (A1) from a 5-trial hyperparameter search.

Fine-tuned with LoRA (r=4, α=8) on 4,500 Alpaca samples.
Avg BLEU: 11.75 | Avg BERTScore: 0.9001 (base model: BLEU 6.74 / BERTScore 0.8786)
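
To put the reported scores in perspective, the gain over the base model can be worked out directly from the figures above. This is a small arithmetic sketch only; the helper name `improvement` is illustrative and is not part of the original evaluation code.

```python
def improvement(base: float, tuned: float):
    """Return (absolute gain, relative gain in percent) of tuned over base."""
    absolute = tuned - base
    relative_pct = 100.0 * absolute / base
    return absolute, relative_pct

# Scores reported in this card: base model vs. trial A1.
bleu_abs, bleu_rel = improvement(6.74, 11.75)
bert_abs, bert_rel = improvement(0.8786, 0.9001)

print(f"BLEU:      +{bleu_abs:.2f} ({bleu_rel:.1f}% relative)")
print(f"BERTScore: +{bert_abs:.4f} ({bert_rel:.1f}% relative)")
```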

  • Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Dataset: tatsu-lab/alpaca (5,000 samples, 90/10 split)
  • LoRA: r=4, alpha=8, target=q_proj+v_proj, dropout=0.05
  • LR: 2e-4 | Batch: 4 | Epochs: 1
  • Platform: Kaggle T4 GPU | Training time: ~557s
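
The LoRA settings listed above determine how many parameters are actually trained. The sketch below estimates that count for adapters on q_proj and v_proj. The layer dimensions are assumptions taken from the base model's published configuration (hidden size 2048, 22 decoder layers, 4 key/value heads of dimension 64) and are not stated in this card, so treat the result as an estimate rather than a reported figure.

```python
# Assumed TinyLlama-1.1B dimensions (from the base model's published config,
# not from this card): hidden size 2048, 22 decoder layers,
# 4 key/value heads of dimension 64.
HIDDEN = 2048
NUM_LAYERS = 22
KV_DIM = 4 * 64  # grouped-query attention makes v_proj narrower than q_proj

# LoRA settings reported in this card.
RANK = 4
ALPHA = 8

def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one adapter: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)

# One adapter on q_proj and one on v_proj in every decoder layer.
per_layer = lora_params(HIDDEN, HIDDEN, RANK) + lora_params(HIDDEN, KV_DIM, RANK)
total = per_layer * NUM_LAYERS

# Standard LoRA scales the low-rank update by alpha / r.
scaling = ALPHA / RANK

print(f"trainable LoRA parameters: {total:,}")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumptions the adapters make up well under 0.1% of the 1.1B base parameters, which is consistent with a training run of a few minutes on a single T4.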

โš ๏ธ WARNING

This model is a research/academic artifact trained for an NLP course assignment. It lacks safety guardrails and may produce harmful, inaccurate, or inappropriate content. Do not use in production or with end users without implementing proper safety measures.

Safety Recommendations:

  • Add content filtering before any deployment
  • Use with supervised moderation only
  • Not suitable for public-facing applications

Training Details

  • Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0 (1.1B parameters)
  • Dataset: tatsu-lab/alpaca, 5,000 samples (4,500 train / 500 val, seed=42)
  • Method: Supervised Fine-Tuning (SFT) with LoRA via PEFT + TRL
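
The 4,500 / 500 split with a fixed seed can be illustrated as follows. This is a minimal sketch using only the standard library: the card does not say how the 5,000 samples were drawn or which splitting function was used, so the index-shuffling approach here is an assumption and will not reproduce the exact same rows as the original run.

```python
import random

TOTAL_SAMPLES = 5_000
VAL_PERCENT = 10
SEED = 42

# Stand-in for dataset rows; real code would hold Alpaca records here.
indices = list(range(TOTAL_SAMPLES))

# Shuffle with a fixed seed so the split is reproducible across runs.
rng = random.Random(SEED)
rng.shuffle(indices)

# Integer arithmetic avoids floating-point rounding in the split size.
val_size = TOTAL_SAMPLES * VAL_PERCENT // 100
val_idx = indices[:val_size]
train_idx = indices[val_size:]

print(len(train_idx), len(val_idx))
```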

Usage

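from transformers import pipeline

# Load the fine-tuned model as a text-generation pipeline.
pipe = pipeline("text-generation", model="Alizahh/tinyllama-alpaca-sft-A1")

# Alpaca-style prompt: an instruction block followed by an empty response block.
prompt = "### Instruction:\nExplain what inflation means in simple terms.\n\n### Response:\n"

# Greedy decoding (do_sample=False) makes the output deterministic.
result = pipe(prompt, max_new_tokens=150, do_sample=False)

# By default, generated_text contains the prompt followed by the completion.
print(result[0]["generated_text"])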
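
Because the pipeline returns the prompt together with the completion by default, a pair of small helpers can build the prompt and strip the echo. This is a sketch under stated assumptions: the helper names `build_prompt` and `extract_response` are hypothetical, and the template simply mirrors the prompt string used in the example above.

```python
ALPACA_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"

def build_prompt(instruction: str) -> str:
    """Wrap a plain instruction in the prompt format used in the usage example."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())

def extract_response(prompt: str, generated_text: str) -> str:
    """Drop the echoed prompt, if present, and surrounding whitespace."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip()

prompt = build_prompt("Explain what inflation means in simple terms.")

# Hypothetical pipeline output, used here only to exercise the helper.
fake_output = prompt + "Inflation is when prices rise over time."
print(extract_response(prompt, fake_output))
```

With the real pipeline, `extract_response(prompt, result[0]["generated_text"])` would return only the model's answer.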