---
tags:
  - gpu-runtime-prediction
  - code-understanding
  - regression
  - performance-modeling
datasets:
  - RajBhope/gpu-runtime-prediction-dataset
language:
  - code
library_name: scikit-learn
pipeline_tag: tabular-regression
---

# GPU Runtime Predictor 🚀⚡

Predicts the runtime (in milliseconds) of a GPU kernel or operation from its source code and the target GPU's hardware specifications.

## How It Works

1. **Code Feature Extraction:** analyzes the source code to extract 48 features (tensor dimensions, operation types, complexity indicators).
2. **GPU Feature Encoding:** encodes 12 hardware specs (CUDA cores, memory bandwidth, compute capability, etc.).
3. **ML Prediction:** an ensemble of Gradient Boosted Trees, a Random Forest, and a Neural Network predicts the runtime.
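The three-stage pipeline can be sketched as follows. This is a minimal illustration, assuming a 60-dim input (48 code features concatenated with 12 GPU features) and a simple mean-averaged ensemble; the hyperparameters and the random data are stand-ins, not the released training setup:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.neural_network import MLPRegressor

# Illustrative input: 48 code features + 12 GPU-spec features per sample.
rng = np.random.default_rng(0)
X = rng.random((200, 60))
y = rng.random(200)  # runtime targets

# The three ensemble members named in the model card (toy hyperparameters).
models = [
    GradientBoostingRegressor(random_state=0),
    RandomForestRegressor(random_state=0),
    MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
]
for m in models:
    m.fit(X, y)

# Ensemble prediction: mean of the three per-model predictions.
preds = np.mean([m.predict(X[:5]) for m in models], axis=0)
print(preds.shape)  # (5,)
```

Whether the released ensemble averages uniformly or weights its members is not stated here; a uniform mean is the simplest assumption.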

## Model Comparison

| Model | R² | RMSE | Spearman ρ | MAPE |
|---|---|---|---|---|
| GBR | 0.9923 | 0.0728 | 0.9264 | 16.5% |
| RF | 0.9924 | 0.0724 | 0.9277 | 16.3% |
| NN | 0.9932 | 0.0687 | 0.9187 | 17.0% |
| Ensemble | 0.9930 | 0.0693 | 0.9272 | 16.3% |
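For reference, the reported metrics can be computed with scikit-learn and SciPy; the arrays below are made-up illustrations, not the actual test set:

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import mean_absolute_percentage_error, mean_squared_error

# Toy ground-truth runtimes and predictions (illustrative values only).
y_true = np.array([1.0, 2.0, 4.0, 8.0])
y_pred = np.array([1.1, 1.9, 4.3, 7.5])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))   # root mean squared error
rho, _ = spearmanr(y_true, y_pred)                    # rank correlation
mape = mean_absolute_percentage_error(y_true, y_pred) * 100

print(rho)  # 1.0 (predictions preserve the ranking exactly)
```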

## GPU Catalog (12 GPUs)

| GPU | FP32 TFLOPS | Memory BW | VRAM |
|---|---|---|---|
| NVIDIA T4 | 8.1 | 320 GB/s | 16 GB |
| NVIDIA V100 | 15.7 | 900 GB/s | 32 GB |
| NVIDIA A10G | 31.2 | 600 GB/s | 24 GB |
| NVIDIA A100 40GB | 19.5 | 1555 GB/s | 40 GB |
| NVIDIA A100 80GB | 19.5 | 2039 GB/s | 80 GB |
| NVIDIA L4 | 30.3 | 300 GB/s | 24 GB |
| NVIDIA L40S | 91.6 | 864 GB/s | 48 GB |
| NVIDIA RTX 3090 | 35.6 | 936 GB/s | 24 GB |
| NVIDIA RTX 4090 | 82.6 | 1008 GB/s | 24 GB |
| NVIDIA H100 SXM | 67.0 | 3350 GB/s | 80 GB |
| NVIDIA H100 PCIe | 48.0 | 2039 GB/s | 80 GB |
| NVIDIA RTX A6000 | 38.7 | 768 GB/s | 48 GB |
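A catalog entry can be turned into part of the hardware feature vector along these lines. The dict layout and field names are hypothetical, and only three of the twelve specs are shown:

```python
# Hypothetical spec table built from the catalog above (subset of fields;
# the released model uses 12 hardware specs).
GPU_SPECS = {
    "NVIDIA T4":        {"fp32_tflops": 8.1,  "mem_bw_gbs": 320,  "vram_gb": 16},
    "NVIDIA A100 80GB": {"fp32_tflops": 19.5, "mem_bw_gbs": 2039, "vram_gb": 80},
    "NVIDIA H100 SXM":  {"fp32_tflops": 67.0, "mem_bw_gbs": 3350, "vram_gb": 80},
}

def gpu_feature_vector(name: str) -> list:
    """Return the numeric hardware specs for one catalog entry."""
    spec = GPU_SPECS[name]
    return [spec["fp32_tflops"], spec["mem_bw_gbs"], spec["vram_gb"]]

print(gpu_feature_vector("NVIDIA T4"))  # [8.1, 320, 16]
```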

## 15 Supported Workload Types

`matmul`, `conv2d`, `attention`, `transformer_block`, `linear`, `layernorm`, `batchnorm`, `softmax`, `embedding`, `elementwise`, `reduction`, `pooling`, `FFT`, `sort`, `loss+backward`

## Usage

```python
# See the Gradio demo for interactive use, or load a model directly:
import pickle

with open('model_gbr.pkl', 'rb') as f:
    model = pickle.load(f)
```
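Once loaded, a model takes one feature row per prediction. A sketch under stated assumptions: a freshly fitted regressor stands in for the pickle file, and the 60-feature layout (48 code + 12 GPU) is inferred from the pipeline description above:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Stand-in for the unpickled GBR model, fitted on random 60-dim rows;
# shapes are the point here, not the predicted values.
rng = np.random.default_rng(0)
model = GradientBoostingRegressor(random_state=0).fit(
    rng.random((100, 60)), rng.random(100))

row = rng.random((1, 60))           # one code + GPU feature vector
runtime = model.predict(row)        # shape (1,): predicted runtime
print(runtime.shape)  # (1,)
```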

## Training