TinyGPT-60M

A compact 60M parameter language model trained from scratch for efficient local inference.

Model Details

Architecture: GPT-2-style transformer
Parameters: ~60M (62.3M as reported in the GGUF metadata)
Training Data: SlimPajama + Wikipedia + conversation examples
Training Method: Supervised fine-tuning with heavy overtraining (~10x Chinchilla-optimal)
Inference Speed: 45-55 tokens/second on mobile (Pixel 6a)
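
As a rough illustration of what "~10x Chinchilla-optimal" implies, the sketch below assumes the commonly cited rule of thumb of about 20 training tokens per parameter. The card does not state the actual training token count, so these are estimates rather than reported numbers.

```python
# Rough token-budget estimate. TOKENS_PER_PARAM = 20 is the commonly cited
# Chinchilla rule of thumb; it is an assumption, not a figure from this card.
TOKENS_PER_PARAM = 20
PARAMS = 60_000_000       # nominal parameter count from the card
OVERTRAIN_FACTOR = 10     # "~10x Chinchilla-optimal"


def chinchilla_tokens(params: int, tokens_per_param: int = TOKENS_PER_PARAM) -> int:
    """Approximate compute-optimal training tokens for a given model size."""
    return params * tokens_per_param


optimal = chinchilla_tokens(PARAMS)
overtrained = optimal * OVERTRAIN_FACTOR
print(f"Chinchilla-optimal: ~{optimal / 1e9:.1f}B tokens")
print(f"10x overtrained:    ~{overtrained / 1e9:.1f}B tokens")
```

Under that assumption, the model would have seen on the order of 12B tokens, roughly ten times the ~1.2B a compute-optimal run would use.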

What It Does Well

  • Definitions & Explanations: Clean, coherent answers to "What is X?" questions
  • Factual Recall: Simple closed-ended Q&A with known answers
  • Clean Output: Proper grammar, no word salad, stays on topic for short contexts
  • Local Inference: Runs entirely on-device with minimal memory footprint (250MB GGUF)
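
The ~250MB figure is consistent with storing every weight as a 32-bit float. A minimal sketch of that arithmetic follows, assuming the 62.3M parameter count reported in the GGUF metadata and ignoring file metadata and tokenizer overhead. The F16 row is a hypothetical comparison only; this card mentions only an F32 file.

```python
# Back-of-the-envelope weight size; ignores GGUF metadata and tokenizer overhead.
BYTES_PER_WEIGHT = {"F32": 4, "F16": 2}


def weights_size_mb(n_params: float, dtype: str = "F32") -> float:
    """Approximate size of the raw weights in decimal megabytes."""
    return n_params * BYTES_PER_WEIGHT[dtype] / 1e6


N_PARAMS = 62.3e6  # parameter count reported in the GGUF metadata

print(f"F32: ~{weights_size_mb(N_PARAMS):.0f} MB")
# Hypothetical: what a 16-bit export would weigh (not listed in this card)
print(f"F16: ~{weights_size_mb(N_PARAMS, 'F16'):.0f} MB")
```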

What It Doesn't Do

  • Multi-step Reasoning: Can't chain logic across multiple steps
  • Long-form Coherence: Struggles past ~150 tokens, starts repeating itself
  • Code Generation: Will produce syntactically valid but semantically broken code
  • Memory: No context retention between turns; each prompt is fresh
  • Binary Choices: "A or B?" questions tend to confuse it
  • Factual Consistency: May hallucinate or default to generic templates (e.g., confidently stating that George Washington is still president)

Best Practices

✅ Use for: Single-turn Q&A, definitions, summarization, chat assistants with short responses

❌ Don't use for: Sustained reasoning, code you'll actually run, factual lookups without verification

Capabilities by Task

Task | Works? | Notes
Simple math (2+2) | ✅ | Single-step only
Definitions | ✅✅ | Strong; this is the sweet spot
Stories | ⚠️ | Coherent for 2-3 sentences, then loops
Lists/Recipes | ❌ | Degrades into repetition
Code | ❌ | Valid syntax, broken logic
Multi-turn chat | ⚠️ | No memory between turns

Limitations

This is a tiny model. It's genuinely impressive for its size: most 60M parameter models produce incoherent garbage. But it hits hard capability walls:

  • No working memory: Can't track context across paragraphs
  • No reasoning: Pattern-matches rather than reasons
  • Repetition-prone: Open-ended generation loops on high-probability tokens
  • Hallucination: Confident wrong answers (it's trained to sound coherent, not correct)
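
One way to work around the repetition problem is to post-process the output and cut it off where it starts to loop. The sketch below is a simple heuristic, not part of the model or of llama.cpp; the helper name `truncate_at_repetition` and the 4-word window are illustrative assumptions.

```python
def truncate_at_repetition(text: str, n: int = 4) -> str:
    """Cut text just before the first word n-gram that has already appeared.

    Returns the input unchanged when no n-gram repeats.
    """
    words = text.split()
    seen = set()
    for i in range(len(words) - n + 1):
        gram = tuple(words[i:i + n])
        if gram in seen:
            # Keep everything before the repeated n-gram starts
            return " ".join(words[:i])
        seen.add(gram)
    return text


looping = "the cat sat on the mat and the cat sat on the mat and"
print(truncate_at_repetition(looping))
```

A smaller `n` trims more aggressively but risks cutting legitimate repeated phrases, so the window is worth tuning against real outputs.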

Use Cases

  • Education: Demo how language models actually work under the hood
  • Local chat: Privacy-first inference on your device
  • Mobile app: Base model for custom fine-tuning or RAG
  • Research: Baseline for studying small model behavior and scaling

Running It

# With llama.cpp
llama-server -hf SmallAICreator/TinyGPT-60m:F32

# With Ollama
ollama run hf.co/SmallAICreator/TinyGPT-60m:F32

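# With Python (requires: pip install llama-cpp-python huggingface-hub)
from llama_cpp import Llama

# Download the GGUF file from the Hugging Face Hub and load it
llm = Llama.from_pretrained(
    repo_id="SmallAICreator/TinyGPT-60m",
    filename="tinygpt-sft-f32.gguf",
)
# Illustrative single-turn prompt; keep max_tokens short, since output degrades past ~150 tokens
out = llm("What is photosynthesis?", max_tokens=128)
print(out["choices"][0]["text"])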