An example of a small language model fine-tuned for a domain-specific task (generating Python code).

Direct Use


from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

model_name = "jeff-vincent/distilgpt2-python-codegen"

# Load the tokenizer and model for causal language modeling
tokenizer = AutoTokenizer.from_pretrained(model_name)

# If the tokenizer has no padding token, reuse the end-of-sequence token.
# Adding a separate '[PAD]' token is unnecessary, since it would be overridden here anyway.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
model.resize_token_embeddings(len(tokenizer))

# Input text
input_text = """

class Calculator:
    def __init__(self):
        self.result = None

    def add(self, a, b):
        self.result = a + b

    def subtract
"""
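# Tokenize the prompt; this returns both the input IDs and the attention mask
inputs = tokenizer(input_text, return_tensors="pt").to(device)

# Generate output (token IDs), passing the pad token explicitly to avoid a warning
output_ids = model.generate(**inputs, max_length=200, pad_token_id=tokenizer.pad_token_id)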

# Decode the generated token IDs into text
decoded_output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(decoded_output)
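
Because generation stops at max_length, the decoded text can end partway through a statement. One possible post-processing step, sketched here as an assumption and not as part of this model or of transformers, is to keep only the longest prefix of the output that parses as Python. The helper name longest_valid_prefix and the sample string are hypothetical; the sample stands in for decoded_output and is not real model output.

```python
import ast


def longest_valid_prefix(code: str) -> str:
    """Return the longest prefix of `code`, cut at a line boundary, that parses as Python.

    Hypothetical helper for illustration; not part of transformers or this model's repository.
    """
    lines = code.splitlines()
    # Try progressively shorter prefixes until one parses cleanly.
    for end in range(len(lines), 0, -1):
        candidate = "\n".join(lines[:end])
        try:
            ast.parse(candidate)
        except SyntaxError:  # also covers IndentationError
            continue
        return candidate
    return ""


# Stand-in for `decoded_output`; what the model actually produces will differ.
sample_output = """
class Calculator:
    def __init__(self):
        self.result = None

    def add(self, a, b):
        self.result = a + b

    def subtract(self, a, b):
        self.result = a -
"""
print(longest_valid_prefix(sample_output))
```

In this sample the unfinished subtract method is dropped and the class is kept up to the end of add. The check is purely syntactic, so it says nothing about whether the generated code is correct.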

Model size: 81.9M params (Safetensors, tensor type F32)