ENG_llmV03: 95M Parameter Language Model
Built entirely from scratch in PyTorch. No pretrained weights. No from_pretrained().
Performance
| Metric | Value |
|---|---|
| Base PPL (WikiText-103) | 24.40 |
| GPT-3 Small PPL (reference) | 26.0 |
| Fine-tuned PPL (two-stage LoRA) | 20.83 |
| Trainable params via LoRA | ~1.6M (1.8%) |
| Training hardware | RTX 5050 (8.5GB VRAM) |
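For readers comparing the perplexity rows, here is a minimal sketch of the usual definition: perplexity is the exponential of the mean per-token negative log-likelihood (in nats). The function name `perplexity` and the sample values are illustrative assumptions, not code or data from this repo.

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))

# Hypothetical input: a model that assigns probability 1/4 to every token.
uniform_nlls = [math.log(4.0)] * 10
ppl = perplexity(uniform_nlls)  # perplexity of a uniform 4-way guess

# Inverting the definition: the mean NLL implied by a reported perplexity.
mean_nll_base = math.log(24.40)
```

Under this definition, the drop from 24.40 to 20.83 corresponds to a lower average loss per token, which is why perplexities are only comparable on the same tokenization and test split.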
Architecture
- RoPE positional encoding
- SwiGLU activation
- 12-layer Transformer
- 95M parameters
- Trained on WikiText-103 (103M tokens)
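The first two components listed above can be sketched in plain Python. This is an illustration of the standard techniques, not the model's actual PyTorch code: the interleaved pairing, the `base=10000.0` default, the bias-free feed-forward, and all function names are assumptions.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by the angle pos * base**(-i/d)."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even dimension")
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def silu(x):
    """SiLU (swish): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def swiglu_ffn(x, W_gate, W_up, W_down):
    """SwiGLU feed-forward: W_down @ (silu(W_gate @ x) * (W_up @ x))."""
    gate = [silu(g) for g in matvec(W_gate, x)]
    up = matvec(W_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(W_down, hidden)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query/key vectors: the attention score after RoPE depends
# only on the relative offset between positions (here 2 in both cases).
q = [0.3, -1.2, 0.7, 2.0]
k = [1.1, 0.4, -0.5, 0.9]
score_near = dot(rope(q, 5), rope(k, 3))
score_far = dot(rope(q, 12), rope(k, 10))
```

Because each pair is rotated rather than shifted, RoPE leaves vector norms unchanged and encodes position only through the angle between query and key.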
Fine-Tuning
Two-stage LoRA: R128 → merged → R64
Dataset: 355k clean QA pairs (SciQ + ELI5 + FreebaseQA)
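One plausible reading of the two-stage recipe is: train a rank-128 adapter, fold it into the base weights, then train a fresh rank-64 adapter on the merged weights. The sketch below shows only the merge arithmetic, W + (alpha / r) * B @ A, on toy matrices. The function names, the `alpha` scaling values, and the tiny ranks (2 and 1 standing in for 128 and 64) are assumptions for illustration, not values from this repo.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def merge_lora(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, with A of shape r x d_in, B of d_out x r."""
    r = len(A)
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy base weight (d_out=2, d_in=3); all values are hypothetical.
W0 = [[1.0, 0.0, 0.0],
      [0.0, 1.0, 0.0]]

# Stage 1: higher-rank adapter (rank 2 here), then merge into the base.
A1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B1 = [[0.5, 0.0], [0.0, 0.5]]
W1 = merge_lora(W0, A1, B1, alpha=2.0)

# Stage 2: a fresh lower-rank adapter (rank 1 here) on the merged weights.
A2 = [[0.0, 0.0, 1.0]]
B2 = [[1.0], [2.0]]
W2 = merge_lora(W1, A2, B2, alpha=1.0)
```

Each adapter adds r * (d_in + d_out) trainable values per adapted matrix, which is how LoRA keeps the trainable fraction small relative to the full weight matrix.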
Full Documentation
Built by Debarun Das