QWEN-3B AUTOMOTIVE

MODEL CARD / AUTOMOTIVE DOMAIN ■ PRODUCTION READY

BASE → Qwen/Qwen2.5-3B-Instruct · METHOD → QLoRA / 4-bit NF4 · SAMPLES → 20,000

QLoRA · Automotive · MLflow · LLM-as-a-Judge · PEFT · SFT · Unsloth · 4-bit NF4 · bfloat16 · TRL · BitsAndBytes · MIT License

OVERVIEW

Production-ready, domain-adapted variant of Qwen2.5-3B-Instruct, fine-tuned on automotive instruction-following data using QLoRA with Unsloth optimization. The project includes a multi-metric evaluation pipeline, dataset quality engineering, and MLflow experiment tracking.

Specialized for automotive question answering, diagnostic explanations, vehicle maintenance assistance, and technical guidance. Trained on a curated subset of 20,000 samples from BAAI/IndustryInstruction_Automobiles, filtered with duplicate removal, quality scoring, and token-length checks.

Model Identity

BASE: Qwen/Qwen2.5-3B-Instruct

DEVELOPED BY: Alibaba Cloud / Qwen Team

FINE-TUNED BY: Nasim Raj Laskar

LANGUAGE: English

LICENSE: MIT

Capabilities

► Automotive Q&A (training data quality score: 96.6%)

► Diagnostic Explanations & Troubleshooting

► Repair & Maintenance Guidance

► Vehicle Systems Knowledge

► Safety-focused Technical Instructions

TRAINING DATA & ENGINEERING

Fine-tuned on 20,000 curated samples from BAAI/IndustryInstruction_Automobiles with a comprehensive data engineering pipeline featuring automated quality controls and versioning.

Dataset Processing Pipeline

FORMAT: Qwen chat-template conversations

STRUCTURE: system → user → assistant

SYSTEM PROMPT: "You are an automotive expert assistant."

SAMPLES: ~20,000 curated instruction pairs

SPLITS: 90% train / 5% validation / 5% test
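
As a rough illustration of the format and split listed above, the sketch below wraps one instruction pair in the system → user → assistant structure and cuts a shuffled dataset 90/5/5. The function names, the `seed` value, and the shuffle-then-slice strategy are assumptions made for illustration; the card does not describe how the author's pipeline performs the split.

```python
import random

SYSTEM_PROMPT = "You are an automotive expert assistant."

def to_messages(instruction: str, response: str) -> list[dict]:
    """Wrap one instruction pair in the system -> user -> assistant structure."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": instruction},
        {"role": "assistant", "content": response},
    ]

def split_90_5_5(samples: list, seed: int = 42) -> tuple[list, list, list]:
    """Shuffle a copy of the samples and split it 90% / 5% / 5%."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)  # seed is a hypothetical choice
    n_train = len(shuffled) * 90 // 100    # integer math avoids float rounding
    n_val = len(shuffled) * 5 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # remainder goes to the test split
    return train, val, test
```

With 20,000 samples this yields 18,000 / 1,000 / 1,000 examples. The message lists can then be passed to `tokenizer.apply_chat_template` to produce Qwen chat-template text.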

Data Quality Metrics

QUALITY SCORE: 96.6% average

DUPLICATE RATE: <0.01%

AVG TOKENS: 114.6 per sample

PROMPT TOKENS: 28.2 avg

RESPONSE TOKENS: 86.4 avg

◆ DATA QUALITY FEATURES

Duplicate detection and removal (exact + near-duplicate) · Quality scoring with flagging system · Token length filtering (10–512 tokens) · Malformed structure detection · Automated quality reporting · Dataset versioning with quality tracking
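
A minimal sketch of two of the controls listed above, exact-duplicate removal and token-length filtering, might look like the following. It is not the author's pipeline: the `prompt`/`response` field names are hypothetical, whitespace splitting stands in for the real tokenizer, and the 10–512 range is assumed to apply to the combined prompt and response length. Near-duplicate detection and quality scoring are not covered.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())

def count_tokens(text: str) -> int:
    # Assumption: whitespace tokens approximate the tokenizer's token count.
    return len(text.split())

def clean(samples: list[dict], min_tokens: int = 10, max_tokens: int = 512) -> list[dict]:
    """Drop exact duplicates (after normalization) and out-of-range samples."""
    seen = set()
    kept = []
    for sample in samples:
        key = (normalize(sample["prompt"]), normalize(sample["response"]))
        if key in seen:
            continue  # exact duplicate of an earlier sample
        seen.add(key)
        total = count_tokens(sample["prompt"]) + count_tokens(sample["response"])
        if min_tokens <= total <= max_tokens:
            kept.append(sample)
    return kept
```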

TRAINING CONFIGURATION

Method: QLoRA

Quantization: 4-bit NF4

Precision: bfloat16

Optimizer: AdamW 8bit

Max Seq Len: 512

LR Schedule: Cosine

► LoRA Adapter Parameters

Rank (r)

Alpha (α)

Dropout: 0.0

Trainable Params: ~30M

Total Params: ~3.1B
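
The trainable-parameter figure can be sanity-checked with a short calculation. A LoRA adapter on a linear layer with input size d_in and output size d_out adds r × (d_in + d_out) parameters. The sketch below applies that formula to all attention and MLP projections. The layer dimensions are assumptions taken from the public Qwen2.5-3B configuration, not from this card, and the card does not state the rank, so the ranks tried here are purely illustrative.

```python
# Assumed Qwen2.5-3B dimensions (from the public model config, not this card).
HIDDEN = 2048
INTERMEDIATE = 11008
NUM_LAYERS = 36
KV_DIM = 256  # 2 key/value heads x 128 head dimension

# (input dim, output dim) of each adapted projection in one decoder layer
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_trainable_params(rank: int) -> int:
    """LoRA adds rank * (d_in + d_out) parameters per adapted linear layer."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in PROJECTIONS.values())
    return per_layer * NUM_LAYERS

for rank in (8, 16, 32):  # illustrative ranks only
    print(rank, lora_trainable_params(rank))
```

Under these assumptions a rank of 16 would give 29,933,568 adapter parameters, which is in line with the ~30M reported above and amounts to roughly 1% of the ~3.1B total.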

Infrastructure

PLATFORM: AWS SageMaker

GPU: NVIDIA L4

TRACKING: MLflow + DagsHub

Libraries

Transformers · TRL · PEFT · BitsAndBytes

Accelerate · Datasets · Unsloth · MLflow

◆ TRAINING PIPELINE FEATURES

Validation-based early stopping · Overfitting detection (threshold: 2.0) · Best checkpoint selection via eval_loss · Gradient checkpointing for memory efficiency · MLflow experiment tracking with DagsHub · Target modules: all attention + MLP projections
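
The stopping logic listed above can be sketched in a few lines. The card does not define what the 2.0 overfitting threshold measures, so this example assumes it is the ratio of validation loss to training loss; the `patience` value and the function name are likewise hypothetical.

```python
def select_checkpoint(eval_losses, train_losses, patience=3, overfit_ratio=2.0):
    """Return (best_index, stop_index, overfit_flag) for a run's loss history.

    best_index: evaluation step with the lowest eval_loss seen before stopping.
    stop_index: evaluation step at which training would be halted.
    overfit_flag: True if the stop was triggered by the overfitting check.
    """
    best_index, best_loss = 0, float("inf")
    since_best = 0
    stop_index = -1
    for i, (eval_loss, train_loss) in enumerate(zip(eval_losses, train_losses)):
        stop_index = i
        if eval_loss < best_loss:
            best_index, best_loss, since_best = i, eval_loss, 0
        else:
            since_best += 1
        # Assumed meaning of the threshold: eval loss more than 2x train loss
        overfit = train_loss > 0 and eval_loss / train_loss > overfit_ratio
        if overfit:
            return best_index, stop_index, True
        if since_best >= patience:
            return best_index, stop_index, False
    return best_index, stop_index, False
```

For example, with validation losses `[2.0, 1.5, 1.4, 1.45, 1.5, 1.6]` and a patience of 3, the best checkpoint is index 2 and training stops at index 5.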

EVALUATION RESULTS

MULTI-METRIC EVALUATION SUITE · 20 SAMPLES · LLM-AS-A-JUDGE: GROQ LLAMA 3.3-70B

Core Performance

PERPLEXITY: 5.47

BLEU SCORE: 16.3%

SIMILARITY SCORE: 19.7%

AVG LATENCY: 1,066 ms

THROUGHPUT: 18.8 tok/sec
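
To help interpret these numbers: perplexity is conventionally the exponential of the mean per-token negative log-likelihood, and latency multiplied by throughput gives an approximate response length. The sketch below shows both calculations. It assumes natural-log losses and that latency and throughput were measured over the same generations, neither of which the card confirms.

```python
import math

def perplexity(token_nlls: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token), natural log."""
    return math.exp(sum(token_nlls) / len(token_nlls))

def tokens_per_response(latency_ms: float, tokens_per_sec: float) -> float:
    """Approximate tokens generated per response from latency and throughput."""
    return latency_ms / 1000.0 * tokens_per_sec

# Mean per-token loss implied by the reported perplexity of 5.47
implied_loss = math.log(5.47)
print(round(implied_loss, 2))

# Approximate response length implied by 1,066 ms at 18.8 tok/sec
print(round(tokens_per_response(1066, 18.8), 1))
```

Under those assumptions, a perplexity of 5.47 corresponds to a mean loss of about 1.70 nats per token, and the latency and throughput figures imply responses of roughly 20 tokens.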

LLM-as-a-Judge (0–10)

HELPFULNESS: 7.4 / 10

CORRECTNESS: 8.3 / 10

COHERENCE: 8.8 / 10

INSTRUCTION FOLLOWING: 7.0 / 10

HALLUCINATION RISK ↓: 8.4 / 10

SAFETY: 9.2 / 10

GPU Metrics

PEAK VRAM: 6.26 GB

AVG GPU UTIL: 72.6%

MAX GPU UTIL: 100.0%

Training Throughput

AVG TRAINING: 175 tok/sec

MAX TRAINING: 643 tok/sec

PRODUCTION FEATURES

Experiment Tracking & Monitoring

► MLflow integration with DagsHub remote tracking

► Automated parameter and metric logging

► GPU profiling with VRAM and power monitoring

► Runtime configuration capture

► Git metadata tracking

Evaluation Framework

► Multi-metric evaluation suite

► LLM-as-a-Judge via external API (Groq)

► Pairwise comparison capabilities

► Automated post-training evaluation

► Performance benchmarking

◆ DATA ENGINEERING PIPELINE

Automated dataset quality analysis · Duplicate detection and removal · Quality scoring and filtering · Dataset versioning system · Comprehensive quality reporting

EXAMPLE USAGE

Python

# Load the model and tokenizer from the Hugging Face Hub
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Nasim435/Qwen-3B-Automotive-20K"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # place weights on the available GPU(s) or CPU
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)

# Use the same system prompt the model was fine-tuned with
prompt = "Explain symptoms of a failing alternator and diagnostic steps."
messages = [
    {"role": "system", "content": "You are an automotive expert assistant."},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))

LIMITATIONS & SAFETY

⚠ WARNING

Research model — experimental fine-tune, not intended for production safety systems
Hallucination risk — may generate inaccurate automotive advice (8.4/10 risk score)
Safety critical — not suitable for safety-critical or professional mechanical decision-making
Domain scope — trained on 20K samples; generalization beyond automotive may be limited
Quality assurance — 96.6% data quality score with 263 flagged samples requiring review

TECHNICAL ARCHITECTURE

Memory Optimization

► Unsloth FastLanguageModel integration

► 4-bit quantization with NF4 format

► Gradient checkpointing for memory efficiency

► Peak VRAM: 6.26 GB — consumer GPU compatible
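
A back-of-the-envelope estimate shows why 4-bit quantization matters at this model size. The sketch below computes weight storage only, using the ~3.1B parameter count from this card. It ignores quantization constants, layers kept in higher precision, activations, adapter weights, and optimizer state, so it is a lower bound and not a breakdown of the 6.26 GB peak.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Storage for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

TOTAL_PARAMS = 3.1e9  # approximate total from the training configuration

bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)  # unquantized bfloat16 weights
nf4_gb = weight_memory_gb(TOTAL_PARAMS, 4)    # 4-bit NF4 weights

print(f"bf16 weights: {bf16_gb:.2f} GB")
print(f"NF4 weights:  {nf4_gb:.2f} GB")
```

By this estimate the base weights shrink from about 6.2 GB in bfloat16 to about 1.55 GB in NF4, leaving the rest of the memory budget for activations and training state.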

Performance Optimizations

► Fused attention kernels via Unsloth

► Optimized transformer implementations

► Efficient LoRA adapter injection

► 2–5x speedup over standard implementations

◆ MONITORING & OBSERVABILITY

Real-time GPU utilization tracking · Memory usage profiling · Training throughput monitoring · Automated performance benchmarking

ACKNOWLEDGEMENTS

Qwen Team: Alibaba Cloud

BAAI: Dataset Contributors

Hugging Face: Ecosystem

TRL / PEFT: Contributors

Unsloth: Contributors

BitsAndBytes: Quantization

MLflow / DagsHub: Experiment Tracking

Groq: LLM-as-a-Judge API

Safetensors · Model size: 3B params · Tensor type: F32, BF16
