NanoCode-GPT 🚀

A custom decoder-only Transformer language model trained from scratch for Python code generation. It was built as an educational and experimental project and uses modern architectural choices such as Rotary Position Embeddings (RoPE) and SwiGLU activations.
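For readers new to the two components named above, here is a minimal pure-Python sketch of the general techniques. It is not the model's actual code: the function names (`rope`, `silu`, `swiglu_ffn`), the pairing of adjacent dimensions, the base of 10000, and the bias-free projections are all assumptions chosen for illustration.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotary position embedding (illustrative): rotate each pair
    (vec[i], vec[i + 1]) by an angle that grows with the token position."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    out = []
    for i in range(0, dim, 2):
        # Lower pair indices rotate faster, higher ones slower.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def silu(x):
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward block (illustrative, no biases):
    w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)
```

In attention, the rotation is applied to query and key vectors, so their dot product depends only on the distance between the two positions rather than on their absolute values. The gating in SwiGLU lets one projection scale the other element-wise before the down projection.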

Model Details

  • Architecture: Custom Transformer (Decoder-only)
  • Parameters: ~57.3M
  • Context Length: 512 tokens
  • Vocabulary Size: 32,000 (Custom SentencePiece BPE)
  • Format: Safetensors

Training Data

The model was trained on a curated dataset of ~20,000 high-quality code examples, including:

  • CodeAlpaca: Instruction-to-code examples.
  • CodeSearchNet: Real-world Python code with docstrings.
  • Synthetic Data: Curated algorithmic examples (sorting, searching, data structures).

How to Use

Because this is a custom architecture, running the model requires the original CodeGPT class definition and inference script. We recommend either downloading the standalone .zip package from the original repository or using the provided inference code to load the Safetensors weights.
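Before wiring up the CodeGPT class, it can help to check what a checkpoint contains. The sketch below uses only the standard library and relies on the published Safetensors layout: an 8-byte little-endian header length followed by a JSON header that maps tensor names to dtype, shape, and data offsets. The helper names and the file name `model.safetensors` are hypothetical, and this does not replace the provided inference code.

```python
import json
import math
import struct


def read_safetensors_header(path):
    """Return the tensor table of a .safetensors file without loading weights."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # Optional free-form metadata entry; it is not a tensor.
    header.pop("__metadata__", None)
    return header


def count_parameters(header):
    """Sum the element counts of every tensor listed in the header."""
    return sum(math.prod(info["shape"]) for info in header.values())


# Example usage (hypothetical file name):
# header = read_safetensors_header("model.safetensors")
# for name, info in header.items():
#     print(name, info["dtype"], info["shape"])
# print(count_parameters(header))
```

Listing the tensor names and shapes this way shows which layer names the class definition must match when the weights are loaded.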

Limitations

This is a small-scale, experimental model (~57M parameters) trained for educational purposes. It is best suited for short, simple Python functions and algorithmic tasks. It may produce syntactically incorrect code or hallucinate logic on complex requests. Always review and test the generated code.
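As a first line of review, generated snippets can be screened for syntax errors before anyone reads or runs them. The following is a minimal sketch using Python's built-in `ast` module. The helper name `check_python_syntax` is hypothetical, and a passing check says nothing about whether the logic is correct.

```python
import ast


def check_python_syntax(source):
    """Return (True, None) if the source parses as Python,
    otherwise (False, short error description)."""
    try:
        ast.parse(source)
    except SyntaxError as err:
        return False, f"line {err.lineno}: {err.msg}"
    return True, None


# Example usage with a hand-written stand-in for model output:
generated = "def add(a, b):\n    return a + b\n"
ok, error = check_python_syntax(generated)
```

Parsing does not execute the code, so this check is safe to run on untrusted output. Running the code against test cases is still needed to catch logic errors, ideally in an isolated environment.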
