🎯 Pico: Tiny Language Models for Learning Dynamics Research

Pico is a framework for training and analyzing small language models, designed for clarity and educational use. Built on a LLaMA-style architecture, Pico makes it easy to experiment with and understand transformer-based language models.

🔑 Key Features

  • Simple Architecture: Clean, modular implementation of core transformer components
  • Educational Focus: Well-documented code with clear references to academic papers
  • Research Ready: Built-in tools for analyzing model learning dynamics
  • Efficient Training: Pre-tokenized dataset and optimized training loop
  • Modern Stack: Built with PyTorch Lightning, Weights & Biases, and Hugging Face integrations (a minimal training-loop sketch follows this list)
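
To make the training-stack bullet concrete, here is a minimal sketch of how a Lightning training loop over a pre-tokenized dataset might look. The module name `TinyLM`, the project name, and all hyperparameters are illustrative assumptions for this sketch, not Pico's actual API.

```python
# Minimal sketch of a Lightning training loop over pre-tokenized data.
# `TinyLM` and every hyperparameter here are hypothetical, not Pico's API.
import pytorch_lightning as pl
import torch
import torch.nn.functional as F
from pytorch_lightning.loggers import WandbLogger


class TinyLM(pl.LightningModule):
    def __init__(self, model):
        super().__init__()
        self.model = model  # any LLaMA-style decoder returning (B, T, vocab) logits

    def training_step(self, batch, batch_idx):
        # batch is pre-tokenized: a (B, T) tensor of token ids
        input_ids, targets = batch[:, :-1], batch[:, 1:]
        logits = self.model(input_ids)
        loss = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)), targets.reshape(-1)
        )
        self.log("train/loss", loss)  # forwarded to Weights & Biases
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=3e-4)


# trainer = pl.Trainer(max_steps=10_000, logger=WandbLogger(project="pico"))
# trainer.fit(TinyLM(model), train_dataloader)
```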

🏗️ Core Components

  • RMSNorm for stable layer normalization
  • Rotary Positional Embeddings (RoPE) for position encoding
  • Multi-head attention with KV-cache support
  • SwiGLU activation function
  • Residual connections throughout (each of these pieces is sketched below)
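
For readers who want to see what these components look like in code, here is a compact PyTorch sketch of RMSNorm, RoPE, and a SwiGLU feed-forward. This is an illustrative reimplementation of the published techniques, not Pico's exact source; shapes and names are assumptions for this sketch.

```python
# Illustrative PyTorch versions of the core components; names and shapes
# are assumptions for this sketch, not Pico's exact implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    """Root-mean-square normalization (Zhang & Sennrich, 2019): rescale
    activations by their RMS, with no mean-centering and no bias."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * x * rms


def apply_rope(x: torch.Tensor, theta: float = 10000.0) -> torch.Tensor:
    """Rotary positional embeddings (Su et al., 2021): rotate each pair of
    channels by an angle proportional to the token's position."""
    B, T, H, D = x.shape  # (batch, seq_len, n_heads, head_dim), D even
    freqs = theta ** (-torch.arange(0, D, 2, dtype=torch.float32) / D)
    angles = torch.arange(T, dtype=torch.float32)[:, None] * freqs[None, :]
    cos = angles.cos()[None, :, None, :]  # broadcast to (1, T, 1, D/2)
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]   # even / odd channel pairs
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out


class SwiGLU(nn.Module):
    """Gated feed-forward block (Shazeer, 2020): silu(x W_gate) * (x W_up),
    projected back down to the model dimension."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))
```

In a LLaMA-style block these compose as pre-norm residual sublayers, x = x + attn(norm(x)) followed by x = x + swiglu(norm(x)), with RoPE applied to the queries and keys inside the attention; the attention module and its KV-cache are omitted here for brevity.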

📚 References

Our implementation draws inspiration from and builds upon:

  • LLaMA: Open and Efficient Foundation Language Models (Touvron et al., 2023)
  • Root Mean Square Layer Normalization (Zhang & Sennrich, 2019)
  • RoFormer: Enhanced Transformer with Rotary Position Embedding (Su et al., 2021)
  • GLU Variants Improve Transformer (Shazeer, 2020)

🤝 Contributing

We welcome contributions of all kinds, including:

  • Adding new features
  • Improving documentation
  • Fixing bugs
  • Sharing experimental results

📝 License

This project is released under the Apache License 2.0.

📫 Contact