| [PolyCoder](https://github.com/VHellendoorn/Code-LMs) uses the GPT-2 architecture, with a BPE tokenizer trained on a random 5% subset of the data (all languages) and a context length of 2048. To study the effect of scaling the model size, the model was trained in 3 different sizes. | |
| <div align="center"> | |
| |Model | # parameters | | |
| | - | - | | |
| | GPT2 | 160M | | |
| | GPT2 | 400M | | |
| | GPT2 | 2.7B | | |
| </div> | |
| PolyCoder is currently being integrated into 🤗 `transformers`. Meanwhile, it can be loaded by following the instructions in the original GitHub [repo](https://github.com/vhellendoorn/code-lms#models). |
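
To give a feel for how the parameter counts in the table above arise, here is a minimal sketch that counts the parameters of a GPT-2 style decoder from its depth, width, vocabulary size, and context length. The function `gpt2_param_count` is a hypothetical helper, and the depth and width pairs in the loop are illustrative assumptions, not PolyCoder's published configurations. Only the context length of 2048 comes from the text above. The formula is checked against the public GPT-2 small configuration.

```python
def gpt2_param_count(n_layer, d_model, vocab_size, n_ctx, tie_embeddings=True):
    """Count the parameters of a GPT-2 style decoder-only transformer."""
    # Token embeddings plus learned position embeddings.
    embed = vocab_size * d_model + n_ctx * d_model
    # Attention: fused QKV projection and output projection, both with biases.
    attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: two linear layers with a 4x hidden expansion, both with biases.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    norms = 2 * 2 * d_model
    block = attn + mlp + norms
    # One final LayerNorm after the last block.
    total = embed + n_layer * block + 2 * d_model
    if not tie_embeddings:
        # A separate output projection adds another vocab_size x d_model matrix.
        total += vocab_size * d_model
    return total


# Sanity check against the public GPT-2 small configuration.
print(gpt2_param_count(12, 768, 50257, 1024))  # → 124439808

# Illustrative (assumed) depth/width pairs at the 2048 context length.
for n_layer, d_model in [(12, 768), (24, 1024), (32, 2560)]:
    count = gpt2_param_count(n_layer, d_model, vocab_size=50257, n_ctx=2048)
    print(f"{n_layer} layers, width {d_model}: {count / 1e6:.0f}M parameters")
```

The per-block term grows with the square of the width, so widening the model raises the count much faster than adding layers does. Doubling the context length only adds `n_ctx * d_model` position-embedding parameters.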