---
tags:
  - gpu-runtime-prediction
  - code-understanding
  - regression
  - performance-modeling
datasets:
  - RajBhope/gpu-runtime-prediction-dataset
language:
  - code
library_name: scikit-learn
pipeline_tag: tabular-regression
---

# GPU Runtime Predictor 🚀⚡

Predicts the runtime (in milliseconds) of a GPU kernel or operation from its source code and the target GPU's hardware specifications.

## How It Works

1. **Code Feature Extraction:** analyzes the source code to extract 48 features (tensor dimensions, operation types, complexity indicators).
2. **GPU Feature Encoding:** encodes 12 hardware specs (CUDA cores, memory bandwidth, compute capability, etc.).
3. **ML Prediction:** an ensemble of Gradient Boosted Trees, a Random Forest, and a Neural Network predicts the runtime.
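The three-stage pipeline can be sketched as follows. This is a minimal illustration, assuming a 60-dim input (48 code features concatenated with 12 GPU features) and a simple mean-averaged ensemble; the hyperparameters and the random data are stand-ins, not the released training setup:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.neural_network import MLPRegressor

# Illustrative input: 48 code features + 12 GPU-spec features per sample.
rng = np.random.default_rng(0)
X = rng.random((200, 60))
y = rng.random(200)  # runtime targets

# The three ensemble members named in the model card (toy hyperparameters).
models = [
    GradientBoostingRegressor(random_state=0),
    RandomForestRegressor(random_state=0),
    MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
]
for m in models:
    m.fit(X, y)

# Ensemble prediction: mean of the three per-model predictions.
preds = np.mean([m.predict(X[:5]) for m in models], axis=0)
print(preds.shape)  # (5,)
```

Whether the released ensemble averages uniformly or weights its members is not stated here; a uniform mean is the simplest assumption.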

## Model Comparison

| Model | R² | RMSE | Spearman ρ | MAPE |
|---|---|---|---|---|
| GBR | 0.9923 | 0.0728 | 0.9264 | 16.5% |
| RF | 0.9924 | 0.0724 | 0.9277 | 16.3% |
| NN | 0.9932 | 0.0687 | 0.9187 | 17.0% |
| Ensemble | 0.9930 | 0.0693 | 0.9272 | 16.3% |
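For reference, the reported metrics can be computed with scikit-learn and SciPy; the arrays below are made-up illustrations, not the actual test set:

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import mean_absolute_percentage_error, mean_squared_error

# Toy ground-truth runtimes and predictions (illustrative values only).
y_true = np.array([1.0, 2.0, 4.0, 8.0])
y_pred = np.array([1.1, 1.9, 4.3, 7.5])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))   # root mean squared error
rho, _ = spearmanr(y_true, y_pred)                    # rank correlation
mape = mean_absolute_percentage_error(y_true, y_pred) * 100

print(rho)  # 1.0 (predictions preserve the ranking exactly)
```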

## GPU Catalog (12 GPUs)

| GPU | FP32 TFLOPS | Memory BW | VRAM |
|---|---|---|---|
| NVIDIA T4 | 8.1 | 320 GB/s | 16 GB |
| NVIDIA V100 | 15.7 | 900 GB/s | 32 GB |
| NVIDIA A10G | 31.2 | 600 GB/s | 24 GB |
| NVIDIA A100 40GB | 19.5 | 1555 GB/s | 40 GB |
| NVIDIA A100 80GB | 19.5 | 2039 GB/s | 80 GB |
| NVIDIA L4 | 30.3 | 300 GB/s | 24 GB |
| NVIDIA L40S | 91.6 | 864 GB/s | 48 GB |
| NVIDIA RTX 3090 | 35.6 | 936 GB/s | 24 GB |
| NVIDIA RTX 4090 | 82.6 | 1008 GB/s | 24 GB |
| NVIDIA H100 SXM | 67.0 | 3350 GB/s | 80 GB |
| NVIDIA H100 PCIe | 48.0 | 2039 GB/s | 80 GB |
| NVIDIA RTX A6000 | 38.7 | 768 GB/s | 48 GB |
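A catalog entry can be turned into part of the hardware feature vector along these lines. The dict layout and field names are hypothetical, and only three of the twelve specs are shown:

```python
# Hypothetical spec table built from the catalog above (subset of fields;
# the released model uses 12 hardware specs).
GPU_SPECS = {
    "NVIDIA T4":        {"fp32_tflops": 8.1,  "mem_bw_gbs": 320,  "vram_gb": 16},
    "NVIDIA A100 80GB": {"fp32_tflops": 19.5, "mem_bw_gbs": 2039, "vram_gb": 80},
    "NVIDIA H100 SXM":  {"fp32_tflops": 67.0, "mem_bw_gbs": 3350, "vram_gb": 80},
}

def gpu_feature_vector(name: str) -> list:
    """Return the numeric hardware specs for one catalog entry."""
    spec = GPU_SPECS[name]
    return [spec["fp32_tflops"], spec["mem_bw_gbs"], spec["vram_gb"]]

print(gpu_feature_vector("NVIDIA T4"))  # [8.1, 320, 16]
```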

## 15 Supported Workload Types

`matmul`, `conv2d`, `attention`, `transformer_block`, `linear`, `layernorm`, `batchnorm`, `softmax`, `embedding`, `elementwise`, `reduction`, `pooling`, `FFT`, `sort`, `loss+backward`

## Usage

```python
# See the Gradio demo for interactive use, or load a model directly:
import pickle

with open('model_gbr.pkl', 'rb') as f:
    model = pickle.load(f)
```
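Once loaded, a model takes one feature row per prediction. A sketch under stated assumptions: a freshly fitted regressor stands in for the pickle file, and the 60-feature layout (48 code + 12 GPU) is inferred from the pipeline description above:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Stand-in for the unpickled GBR model, fitted on random 60-dim rows;
# shapes are the point here, not the predicted values.
rng = np.random.default_rng(0)
model = GradientBoostingRegressor(random_state=0).fit(
    rng.random((100, 60)), rng.random(100))

row = rng.random((1, 60))           # one code + GPU feature vector
runtime = model.predict(row)        # shape (1,): predicted runtime
print(runtime.shape)  # (1,)
```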

## Training