gpt2-medium-finetuned-contract-gen

Overview

gpt2-medium-finetuned-contract-gen is a model specialized in generating Solidity contract code. It is fine-tuned from OpenAI's gpt2-medium checkpoint, as hosted on Hugging Face, on a collection of Solidity contracts and patterns, and is intended to assist in drafting or suggesting contract structures.

Model Description

This model is designed specifically for generating Solidity contracts. As a derivative of gpt2-medium, it keeps the parent model's architecture and general text-generation ability, while fine-tuning shifts its output toward Solidity syntax and common contract patterns.

Performance

The model reached a validation loss of 0.3127 on the evaluation set. This is the last value logged in the training results (step 11000, epoch 2.28).
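
For intuition, a cross-entropy loss can be converted to perplexity by exponentiating it. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training with Transformers; the card itself does not state this.

```python
import math

eval_loss = 0.3127  # validation loss reported in this card

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.3f}")  # about 1.367
```

A perplexity this close to 1 is plausible for source code, which is highly repetitive, but it says nothing about whether generated contracts compile or are secure.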

Intended Uses & Limitations

Intended Uses:

  1. Assist developers by auto-generating contract code snippets based on prompts.
  2. Help in understanding and drafting complex contract structures.

Limitations:

  1. The generated code must be reviewed for security and functional correctness.
  2. The clarity of the generated code largely depends on the specificity of the prompt.

Training Details

Dataset

The model was fine-tuned on an undisclosed dataset consisting of a range of Solidity contracts.

Training Hyperparameters:

  • Learning Rate: 5e-05
  • Train Batch Size: 4
  • Evaluation Batch Size: 4
  • Seed: 42
  • Optimizer: Adam (betas=(0.9,0.999), epsilon=1e-08)
  • Learning Rate Scheduler: Cosine with restarts
  • Warmup Steps: 241
  • Epochs: 4
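
To make the learning-rate settings above more concrete, here is a pure-Python sketch of a linear warmup followed by cosine decay with hard restarts. It is an approximation of the schedule shape, not the library's implementation. `TOTAL_STEPS` and `NUM_CYCLES` are assumptions: the card does not report either, and the total is only estimated from the logged results (11000 steps at epoch 2.28 suggests roughly 4,825 steps per epoch, times 4 epochs).

```python
import math

BASE_LR = 5e-5        # from the card
WARMUP_STEPS = 241    # from the card
TOTAL_STEPS = 19_300  # ASSUMPTION: estimated from the logged steps and epochs
NUM_CYCLES = 1        # ASSUMPTION: number of cosine cycles is not reported


def lr_at(step, base_lr=BASE_LR, warmup=WARMUP_STEPS,
          total=TOTAL_STEPS, cycles=NUM_CYCLES):
    """Learning rate at a given optimizer step."""
    if step < warmup:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    if progress >= 1.0:
        return 0.0
    # Cosine decay that restarts from base_lr at the start of each cycle.
    cosine = 0.5 * (1.0 + math.cos(math.pi * ((cycles * progress) % 1.0)))
    return base_lr * max(0.0, cosine)


for step in (0, 241, 5_000, 11_000, 19_300):
    print(step, lr_at(step))
```

With a single cycle the curve is identical to plain cosine decay; restarts only become visible when `NUM_CYCLES` is greater than 1.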

Training Results:

Training Loss   Epoch   Step    Validation Loss
0.4744          0.21    1000    0.4736
0.467           0.41    2000    0.4146
0.4089          0.62    3000    0.3852
0.4018          0.83    4000    0.3688
0.3475          1.04    5000    0.3523
0.2751          1.24    6000    0.3434
0.2966          1.45    7000    0.3334
0.292           1.66    8000    0.3230
0.2899          1.87    9000    0.3200
0.2508          2.07    10000   0.3164
0.28            2.28    11000   0.3127

Dependencies:

  • Transformers: 4.31.0
  • PyTorch: 2.0.1+cu118
  • Datasets: 2.14.2
  • Tokenizers: 0.13.3

How to Use

If you wish to use this model to generate Solidity contract code, follow the steps below:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("ckandemir/gpt2-medium-finetuned-contract-gen")
model = AutoModelForCausalLM.from_pretrained("ckandemir/gpt2-medium-finetuned-contract-gen")

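# Encode the prompt; calling the tokenizer returns input_ids and attention_mask
input_text = "contract MyToken"
inputs = tokenizer(input_text, return_tensors="pt")

# Sample one completion of up to 400 tokens (prompt included).
# GPT-2 has no pad token, so reuse EOS to avoid the padding warning.
sample_output = model.generate(
    **inputs,
    do_sample=True,
    max_length=400,
    num_return_sequences=1,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)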

# Decode and print the generated text
generated_text = tokenizer.decode(sample_output[0], skip_special_tokens=True)
print(generated_text)
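
Because max_length caps the output length, generation can stop in the middle of a function or contract. The helper below is a minimal sketch of one way to trim the decoded text back to the last point where curly braces balance. `trim_to_balanced` is a hypothetical name introduced here, not part of the model or of Transformers, and it deliberately ignores braces inside strings and comments. It is a convenience for display only and does not replace the security and correctness review noted under Limitations.

```python
def trim_to_balanced(source: str) -> str:
    """Return the prefix of `source` ending at the last top-level closing brace.

    Hypothetical helper: counts '{' and '}' only, so braces inside string
    literals or comments can confuse it.
    """
    depth = 0
    last_balanced_end = 0
    for index, char in enumerate(source):
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth < 0:
                break  # stray closing brace; stop scanning
            if depth == 0:
                last_balanced_end = index + 1
    return source[:last_balanced_end]


truncated = "contract A { function f() public {} } contract B { function g("
print(trim_to_balanced(truncated))
```

In practice this would be applied to `generated_text` from the example above; an empty result means no complete top-level block was generated.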