VIT-CodeGPT CAD Code Generator
This model generates CadQuery Python code from images of 3D CAD objects. It pairs a Vision Transformer (ViT) encoder with a CodeGPT decoder in a vision-encoder-decoder architecture.
Model Details
- Architecture: Vision Encoder-Decoder (ViT + CodeGPT)
- Encoder: google/vit-base-patch16-224
- Decoder: microsoft/CodeGPT-small-py
- Task: Image-to-Code Generation (CAD)
- Dataset: CADCODER/GenCAD-Code
- Training Samples: 10,000 (8,500 train / 1,500 val)
- Training Time: ~4 hours 12 minutes
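The encoder checkpoint name (patch16-224) describes its input geometry: 224×224 images are cut into 16×16 patches. The sketch below works out how many encoder tokens the decoder cross-attends over, assuming the standard ViT layout with one prepended [CLS] token. The helper name `vit_sequence_length` is illustrative and not part of any library.

```python
def vit_sequence_length(image_size: int = 224, patch_size: int = 16) -> int:
    """Return the encoder token count for a square image: patches plus one [CLS] token."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    # Assumption: a single [CLS] token is prepended, as in the standard ViT layout.
    return patches_per_side * patches_per_side + 1


print(vit_sequence_length())  # 197
```

Each input image therefore reaches the decoder as 197 encoder states (196 patches plus [CLS]), regardless of how complex the depicted part is.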
Training Configuration
- Batch Size: 4 (effective: 16 with gradient accumulation)
- Learning Rate: 3e-5
- Epochs: 3
- Max Length: 256 tokens
- Optimizer: AdamW with learning-rate warmup
- Mixed Precision: FP16
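The batch-size figures above imply a gradient accumulation factor that the card does not state explicitly. The sketch below derives it, along with a rough optimizer step count. It assumes single-device training and that a final partial batch still triggers an update; neither is confirmed by the card.

```python
import math

per_device_batch_size = 4   # from the configuration above
effective_batch_size = 16   # from the configuration above
train_samples = 8_500       # training split size from Model Details
epochs = 3

# Assumption: the effective batch comes only from accumulation on one device.
accumulation_steps = effective_batch_size // per_device_batch_size

# Assumption: a final partial batch still counts as one optimizer step.
steps_per_epoch = math.ceil(train_samples / effective_batch_size)
total_steps = steps_per_epoch * epochs

print(accumulation_steps, steps_per_epoch, total_steps)  # 4 532 1596
```

Under these assumptions the model sees roughly 1,600 optimizer updates in total.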
Performance
Final training metrics:
- ROUGE-1: 0.0944
- ROUGE-2: 0.0040
- ROUGE-L: 0.0863
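To show what these scores measure, here is a minimal sketch of ROUGE-1 as a unigram-overlap F1 over whitespace-separated tokens. The card does not say which ROUGE implementation, tokenizer, or variant (precision, recall, or F1) produced the numbers above, so this is a simplified illustration. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between two strings, using whitespace tokenization."""
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


reference = 'result = cq.Workplane("XY").box(10, 10, 10)'
candidate = 'result = cq.Workplane("XY").box(10, 20, 10)'
print(round(rouge1_f1(reference, candidate), 3))  # 0.8
```

ROUGE rewards token overlap only. A candidate with one wrong dimension still scores high here, and a geometrically equivalent program written differently would score low.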
Usage
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer
from PIL import Image
import torch

# Load the model, plus the image processor and tokenizer of its base checkpoints
model = VisionEncoderDecoderModel.from_pretrained("Thehunter99/vit-codegpt-cadcoder")
image_processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
tokenizer = AutoTokenizer.from_pretrained("microsoft/CodeGPT-small-py")
model.eval()

# Load the image as RGB (PNG files may carry an alpha channel) and preprocess it
image = Image.open("path/to/your/cad_image.png").convert("RGB")
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values

# Generate CAD code with beam search
with torch.no_grad():
    generated_ids = model.generate(
        pixel_values,
        max_length=256,
        num_beams=4,
        early_stopping=True,
        pad_token_id=tokenizer.eos_token_id,
    )

generated_code = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(generated_code)
Example Output
Input: Image of a 3D cube
Output:
import cadquery as cq
# Create a simple cube
result = cq.Workplane("XY").box(10, 10, 10)
Training Data
The model was trained on the CADCODER/GenCAD-Code dataset, which pairs images of 3D CAD objects with their corresponding CadQuery Python code.
Limitations
- Limited to CadQuery syntax
- Best performance on geometric shapes similar to training data
- May struggle with very complex or unusual CAD designs
- Maximum output length: 256 tokens
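Because output is capped at 256 tokens, a long program can be cut off mid-statement. One way to catch this before running generated code is a syntax check with Python's standard `ast` module, sketched below. The helper name `is_valid_python` is illustrative. The check confirms only that the text parses; it does not execute the code or verify the geometry.

```python
import ast


def is_valid_python(code: str) -> bool:
    """Return True if the text parses as Python source, False otherwise."""
    try:
        ast.parse(code)
    except (SyntaxError, ValueError):
        return False
    return True


complete = 'import cadquery as cq\nresult = cq.Workplane("XY").box(10, 10, 10)'
truncated = 'import cadquery as cq\nresult = cq.Workplane("XY").box(10, 10'

print(is_valid_python(complete), is_valid_python(truncated))  # True False
```

Outputs that fail the check can be discarded or regenerated rather than passed to CadQuery.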
Citation
If you use this model, please cite:
@misc{vit-codegpt-cadcoder,
title={VIT-CodeGPT CAD Code Generator},
author={Your Name},
year={2024},
publisher={Hugging Face},
url={https://huggingface.co/Thehunter99/vit-codegpt-cadcoder}
}