ENG_llmV03: 95M Parameter Language Model

Built entirely from scratch in PyTorch. No pretrained weights. No from_pretrained().

Performance

Metric                             Value
Base PPL (WikiText-103)            24.40
GPT-3 Small PPL (reference)        26.0
Fine-tuned PPL (two-stage LoRA)    20.83
Trainable params via LoRA          ~1.6M (1.8%)
Training hardware                  RTX 5050 (8.5GB VRAM)
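
For readers unfamiliar with the metric, perplexity (PPL) is the exponential of the mean per-token negative log-likelihood. The sketch below is illustrative only: the function name `perplexity` and the sample losses are hypothetical and are not taken from this model's evaluation code.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood), with NLL measured in nats."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical per-token losses, used only to exercise the function.
example_nlls = [3.1, 2.9, 3.4, 3.0]
print(perplexity(example_nlls))

# Going the other way: a reported PPL maps back to a mean cross-entropy.
for name, ppl in [("base", 24.40), ("fine-tuned", 20.83)]:
    print(name, "mean loss in nats/token:", math.log(ppl))
```

Because PPL is exponential in the loss, the drop from 24.40 to 20.83 corresponds to a fairly small change in mean cross-entropy.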

Architecture

  • RoPE positional encoding
  • SwiGLU activation
  • 12-layer Transformer
  • 95M parameters
  • Trained on WikiText-103 (103M tokens)
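
The two less common components in the list above, RoPE and SwiGLU, can be sketched in a few lines. This is a dependency-free illustration and not the model's actual PyTorch code: the function names, the interleaved pairing convention, the base of 10000, and the omission of the feed-forward down-projection are all assumptions made for the example.

```python
import math


def rope(x, pos, base=10000.0):
    """Rotate consecutive pairs (x[2i], x[2i+1]) by an angle that grows with pos."""
    d = len(x)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even dimension")
    out = []
    for i in range(d // 2):
        angle = pos * base ** (-2.0 * i / d)  # one frequency per pair
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[2 * i], x[2 * i + 1]
        out += [a * c - b * s, a * s + b * c]
    return out


def silu(z):
    """SiLU (swish) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def matvec(w, v):
    """Multiply a matrix stored as a list of rows by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def swiglu(x, w_gate, w_up):
    """SwiGLU gating: SiLU(W_gate x) multiplied elementwise by (W_up x)."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    return [g * u for g, u in zip(gate, up)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Toy query/key vectors (hypothetical values).
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Attention scores depend only on the relative offset (2 in both cases).
print(dot(rope(q, 5), rope(k, 3)))
print(dot(rope(q, 12), rope(k, 10)))

identity = [[1.0, 0.0], [0.0, 1.0]]
print(swiglu([1.0, 2.0], identity, identity))
```

The two printed scores agree up to floating-point error, which is the property that makes RoPE a relative positional encoding.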

Fine-Tuning

Two-stage LoRA: R128 → merged → R64
Dataset: 355k clean QA pairs (SciQ + ELI5 + FreebaseQA)
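
As a rough illustration of the two-stage recipe, the sketch below trains nothing but shows the bookkeeping: a first adapter is folded into the weight as W + (alpha / r) * B A, and a second, lower-rank adapter is then applied on top of the merged weight. The helper names, the toy 2x2 matrices, and the alpha values are hypothetical; ranks 2 and 1 stand in for R128 and R64, and the real model dimensions are not stated here.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_b, lora_a, alpha):
    """Return W + (alpha / r) * B @ A, where A is (r, d_in) and B is (d_out, r)."""
    r = len(lora_a)
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def lora_param_count(d_out, d_in, r):
    """Trainable parameters of one adapter: r * d_in for A plus d_out * r for B."""
    return r * (d_in + d_out)


# Toy base weight (hypothetical values).
w_base = [[1.0, 0.0], [0.0, 1.0]]

# Stage 1: higher-rank adapter (rank 2 here), then merge into the base weight.
a1 = [[0.1, 0.0], [0.0, 0.2]]
b1 = [[1.0, 0.0], [0.0, 1.0]]
w_stage1 = merge_lora(w_base, b1, a1, alpha=2.0)

# Stage 2: a fresh lower-rank adapter (rank 1 here) on top of the merged weight.
a2 = [[0.5, 0.5]]
b2 = [[0.2], [0.0]]
w_stage2 = merge_lora(w_stage1, b2, a2, alpha=1.0)

print(w_stage1)
print(w_stage2)
print(lora_param_count(d_out=2, d_in=2, r=2), lora_param_count(d_out=2, d_in=2, r=1))
```

Merging between stages means the second adapter starts from the already-adapted weights, and halving the rank halves the trainable parameter count per adapted matrix.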

Full Documentation

Technical documentation →

Built by Debarun Das
