Debarun12/ENG-SLM-FINETUNED

ENG_llmV03 — 95M Parameter Language Model

Built entirely from scratch in PyTorch. No pretrained weights. No from_pretrained().

Performance

Metric	Value
Base PPL (WikiText-103)	24.40
GPT-3 Small PPL (reference)	26.0
Fine-tuned PPL (two-stage LoRA)	20.83
Trainable params via LoRA	~1.6M (1.8%)
Training hardware	RTX 5050 (8.5 GB VRAM)
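
Perplexity is conventionally the exponential of the mean per-token cross-entropy loss. The following is a minimal sketch of that relation, assuming losses measured in nats; the helper name `perplexity` is illustrative and does not come from this model's codebase.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood); NLL values are in nats."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# A mean loss of ln(24.40) nats per token corresponds to a perplexity of 24.40.
base_loss = math.log(24.40)
base_ppl = perplexity([base_loss] * 4)
print(round(base_ppl, 2))
```

Because the mapping is exponential, small changes in mean loss translate into noticeably larger changes in perplexity.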

Architecture

RoPE positional encoding
SwiGLU activation
12-layer Transformer
95M parameters
Trained on WikiText-103 (103M tokens)
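
The two architectural components named above, RoPE and SwiGLU, can be sketched in a few lines of PyTorch. This is an illustration under stated assumptions, not the model's actual code: the function and class names, the frequency base of 10000, the half-split channel pairing, the bias-free projections, and all tensor sizes are hypothetical choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x, shaped (batch, seq, heads, head_dim), by position."""
    _, seq_len, _, head_dim = x.shape
    assert head_dim % 2 == 0, "head_dim must be even"
    half = head_dim // 2
    # One frequency per channel pair: base^(-2i / head_dim) for i in [0, half).
    inv_freq = base ** (-torch.arange(half, dtype=torch.float32) / half)
    positions = torch.arange(seq_len, dtype=torch.float32)
    angles = torch.outer(positions, inv_freq)  # (seq, half)
    cos = angles.cos()[None, :, None, :]
    sin = angles.sin()[None, :, None, :]
    # Assumed pairing: channel i is rotated together with channel i + half.
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)


class SwiGLU(nn.Module):
    """Gated feed-forward block: down(silu(gate(x)) * up(x))."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_hidden, bias=False)
        self.w_up = nn.Linear(d_model, d_hidden, bias=False)
        self.w_down = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


# Hypothetical sizes, chosen only to exercise the code.
q = torch.randn(2, 16, 4, 32)
q_rot = apply_rope(q)
ffn = SwiGLU(d_model=128, d_hidden=344)
y = ffn(torch.randn(2, 16, 128))
```

RoPE is applied to the query and key tensors inside attention, so relative position enters through their dot product rather than through an added embedding.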

Fine-Tuning

Two-stage LoRA: a rank-128 adapter is trained and merged into the base weights, then a rank-64 adapter is trained on the merged model
Dataset: 355k clean QA pairs (SciQ + ELI5 + FreebaseQA)
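
The two-stage procedure can be sketched as follows: wrap a frozen linear layer with a low-rank adapter, fold the adapter into the weight, then wrap the merged layer with a smaller adapter. This is a hedged illustration, not the author's training code. The `LoRALinear` class, the 256-dimensional layer, the `alpha` values, and the random stand-in for a trained adapter are all assumptions.

```python
import math

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable update: W x + (alpha / r) * B A x."""

    def __init__(self, base: nn.Linear, rank: int, alpha: float):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # only A and B are trained
        self.scale = alpha / rank
        self.A = nn.Parameter(torch.empty(rank, base.in_features))
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at start
        nn.init.kaiming_uniform_(self.A, a=math.sqrt(5))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    @torch.no_grad()
    def merge(self) -> nn.Linear:
        """Fold the adapter into the base weight and return the plain layer."""
        self.base.weight += self.scale * (self.B @ self.A)
        return self.base


torch.manual_seed(0)
layer = nn.Linear(256, 256)  # hypothetical layer size
x = torch.randn(4, 256)

# Stage 1: rank-128 adapter (training is simulated by randomizing B).
stage1 = LoRALinear(layer, rank=128, alpha=128.0)
with torch.no_grad():
    stage1.B.normal_(std=0.01)
y_adapter = stage1(x)
merged = stage1.merge()
y_merged = merged(x)

# Stage 2: a fresh rank-64 adapter on top of the merged weights.
stage2 = LoRALinear(merged, rank=64, alpha=64.0)
trainable = sum(p.numel() for p in stage2.parameters() if p.requires_grad)
```

Merging leaves the layer's outputs unchanged while discarding the first adapter's parameters, so the second stage starts from the improved weights with a smaller trainable footprint.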

Full Documentation

Technical documentation →

Built by Debarun Das
