CodeParrot uses the GPT-2 architecture with a BPE tokenizer trained on Python code from the training split of the data, and has a context length of 1024 tokens. The model was released as an educational tool for training large language models from scratch on code, with detailed tutorials and descriptions of the training process. Training uses 🤗 Accelerate for distributed training and mixed precision. See this blog and repo for more details.
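
To make the 1024-token context length concrete, here is a minimal sketch of how a long sequence of token IDs could be split into context-sized windows before being passed to the model. The function name `chunk_token_ids` and the choice to drop an incomplete final window are illustrative assumptions, not a description of the actual CodeParrot data pipeline.

```python
from typing import List

CONTEXT_LENGTH = 1024  # context length stated in the description


def chunk_token_ids(token_ids: List[int],
                    context_length: int = CONTEXT_LENGTH,
                    drop_last: bool = True) -> List[List[int]]:
    """Split a flat list of token IDs into consecutive windows of `context_length`."""
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    chunks = [token_ids[i:i + context_length]
              for i in range(0, len(token_ids), context_length)]
    # Assumption: an incomplete final window is discarded rather than padded.
    if drop_last and chunks and len(chunks[-1]) < context_length:
        chunks.pop()
    return chunks


# Stand-in token IDs; in practice these would come from the tokenizer.
example_ids = list(range(2500))
windows = chunk_token_ids(example_ids)
print(len(windows), [len(w) for w in windows])
```

With 2500 stand-in IDs this yields two full windows of 1024 tokens; the remaining 452 tokens are dropped under the `drop_last` assumption.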

| Model | # parameters |
| --- | --- |
| codeparrot-small | 110M |
| codeparrot | 1.5B |
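
As a rough sanity check on these sizes, the parameter count of a GPT-2 style model can be estimated from its hyperparameters. The sketch below assumes the two checkpoints use the standard GPT-2 small (12 layers, hidden size 768) and GPT-2 XL (48 layers, hidden size 1600) shapes, a vocabulary of 32768 tokens, and tied input/output embeddings. None of these values are stated here, so treat them as assumptions and check each model's config before relying on them.

```python
def gpt2_param_count(n_layer: int, d_model: int, vocab_size: int,
                     n_positions: int = 1024) -> int:
    """Estimate the parameter count of a GPT-2 style decoder with tied embeddings."""
    token_emb = vocab_size * d_model      # token embedding (shared with the LM head)
    pos_emb = n_positions * d_model       # learned position embedding
    # Per block: attention (4*d^2) + MLP (8*d^2) weights,
    # plus 9*d biases and 4*d layer-norm parameters.
    per_block = 12 * d_model ** 2 + 13 * d_model
    final_ln = 2 * d_model                # final layer norm (scale and bias)
    return token_emb + pos_emb + n_layer * per_block + final_ln


ASSUMED_VOCAB_SIZE = 32768  # assumption: not stated in this document

small = gpt2_param_count(n_layer=12, d_model=768, vocab_size=ASSUMED_VOCAB_SIZE)
large = gpt2_param_count(n_layer=48, d_model=1600, vocab_size=ASSUMED_VOCAB_SIZE)
print(f"codeparrot-small: ~{small / 1e6:.0f}M, codeparrot: ~{large / 1e9:.2f}B")
```

Under these assumptions the estimates come out near 111M and 1.53B, in line with the rounded figures in the table.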

You can load the model and tokenizer directly with 🤗 Transformers:

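from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
model = AutoModelForCausalLM.from_pretrained("codeparrot/codeparrot")

# Tokenize a prompt and run a single forward pass
inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model(**inputs)  # outputs.logits holds next-token scores for each position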
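
The forward pass above returns logits, not text. To get an actual code completion you would typically call `generate` and decode the result. The helper below is a minimal sketch: the name `complete_code`, the greedy decoding setting, and the use of the end-of-sequence token as the padding token are illustrative choices, not settings taken from the CodeParrot documentation.

```python
def complete_code(model, tokenizer, prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation of `prompt` and return prompt plus completion."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,                      # greedy decoding, deterministic
        pad_token_id=tokenizer.eos_token_id,  # assumption: no dedicated pad token
    )
    # generate returns a batch; decode the first (and only) sequence
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the `model` and `tokenizer` loaded as shown earlier, `print(complete_code(model, tokenizer, "def hello_world():"))` would print the prompt followed by the model's continuation.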

You can also use a text-generation pipeline to generate code:

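from transformers import pipeline

# Build a text-generation pipeline backed by the CodeParrot checkpoint
pipe = pipeline("text-generation", model="codeparrot/codeparrot")
outputs = pipe("def hello_world():")  # list of dicts, each with a "generated_text" key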
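
A text-generation pipeline returns a list of dictionaries, and by default each `generated_text` value includes the original prompt. The sketch below shows one way to keep only the newly generated part. The helper name `extract_completions` is hypothetical, and `sample_outputs` is a hand-written stand-in for the pipeline's return value, not real model output.

```python
from typing import Dict, List


def extract_completions(outputs: List[Dict[str, str]], prompt: str) -> List[str]:
    """Return each generated text with the leading prompt removed, if present."""
    completions = []
    for item in outputs:
        text = item["generated_text"]
        if text.startswith(prompt):
            text = text[len(prompt):]
        completions.append(text)
    return completions


# Hand-written stand-in with the same shape as a pipeline result.
sample_outputs = [{"generated_text": 'def hello_world():\n    print("Hello, world!")'}]
print(extract_completions(sample_outputs, "def hello_world():"))
```

In real use you would pass the `outputs` returned by `pipe(...)` together with the prompt string you gave it.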