metadata
title: CurvOpt SmarterModels
emoji: 📊
colorFrom: red
colorTo: red
sdk: gradio
sdk_version: 6.6.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Smarter Models, Smaller Footprint
CurvOpt-LLM — Realtime Optimizer
Curvature-guided mixed-precision optimization for LLMs. No retraining required.
What This Does
- Loads any HuggingFace causal LM
- Computes per-layer diagonal Fisher curvature from real gradients on calibration samples
- Assigns FP32 / FP16 / BF16 per layer based on sensitivity
- Rewrites and saves a deployable optimized model (downloadable ZIP)
- Reports electricity, CO₂, and water footprint savings
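The curvature and precision-assignment steps above can be illustrated with a minimal sketch. It is not the Space's actual implementation: the function names, the toy gradients, and the two thresholds are hypothetical, and the mapping of mid-sensitivity layers to BF16 and low-sensitivity layers to FP16 is an assumption made only for illustration. The sketch scores each layer by its mean squared gradient (a diagonal Fisher estimate) and maps the score to a precision label.

```python
from statistics import fmean


def fisher_diagonal_score(per_sample_grads):
    """Mean squared gradient over all samples and parameters of one layer."""
    squared = [g * g for sample in per_sample_grads for g in sample]
    return fmean(squared)


def assign_precisions(scores, high=1e-2, low=1e-4):
    """Map each layer's sensitivity score to a precision label.

    The thresholds `high` and `low` are hypothetical; a real tool would
    tune them, for example against a perplexity tolerance.
    """
    plan = {}
    for name, score in scores.items():
        if score >= high:
            plan[name] = "fp32"   # most sensitive: keep full precision
        elif score >= low:
            plan[name] = "bf16"   # assumed choice for mid sensitivity
        else:
            plan[name] = "fp16"   # assumed choice for low sensitivity
    return plan


# Toy gradients: two calibration samples, three parameters per layer.
grads = {
    "layer.0": [[0.5, -0.4, 0.3], [0.6, -0.2, 0.1]],
    "layer.1": [[0.05, 0.02, -0.03], [0.04, -0.01, 0.02]],
    "layer.2": [[0.001, -0.002, 0.003], [0.002, 0.001, -0.001]],
}
scores = {name: fisher_diagonal_score(g) for name, g in grads.items()}
plan = assign_precisions(scores)
print(plan)
```

In a real run the per-sample gradients would come from backpropagating the language-modeling loss on each calibration sample, rather than from hand-written lists.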
How to Use
- Select a model from the dropdown (or enter a custom HF model ID)
- Set the number of calibration samples (1–32) and the perplexity (PPL) tolerance
- Click Run Optimization
- Download the optimized model ZIP when done
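As an illustration of what a PPL tolerance could mean in practice, the sketch below computes perplexity from per-token negative log-likelihoods and checks whether the optimized model stays within a relative budget. Treating the tolerance as a percentage increase over the baseline is an assumption, and the function names and loss values are hypothetical.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def within_tolerance(baseline_ppl, optimized_ppl, tolerance_pct):
    """True if the relative PPL increase is at most tolerance_pct percent."""
    increase_pct = 100.0 * (optimized_ppl - baseline_ppl) / baseline_ppl
    return increase_pct <= tolerance_pct


# Hypothetical per-token losses before and after mixed-precision conversion.
baseline = perplexity([2.0, 2.2, 1.8])
optimized = perplexity([2.05, 2.2, 1.9])
accepted = within_tolerance(baseline, optimized, tolerance_pct=10.0)
print(f"baseline={baseline:.3f} optimized={optimized:.3f} accepted={accepted}")
```

Under this reading, a tighter tolerance would force more layers to stay in FP32, trading footprint savings for output quality.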
Supported Models
OPT family · GPT-2 family · Pythia · Phi · BLOOM · Mistral · Llama-2 · Qwen · Falcon · and any other AutoModelForCausalLM-compatible model.
Research
Based on Fisher information and Optimal Brain Damage (OBD) curvature analysis. Novel contribution: per-request curvature-gated mixed precision with user intent feedback.
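For reference, the standard quantities behind this analysis can be written as follows. The diagonal Fisher estimate and the OBD saliency are textbook definitions; the per-layer aggregation $S_\ell$ (a plain average over the layer's parameters) is an assumption about how a layer-level score might be formed, not a statement of this project's exact formula.

```latex
% Diagonal Fisher information, estimated from N calibration samples
F_{ii} \;=\; \mathbb{E}\!\left[\left(\frac{\partial \log p(y \mid x;\theta)}{\partial \theta_i}\right)^{2}\right]
\;\approx\; \frac{1}{N}\sum_{n=1}^{N}\left(\frac{\partial \mathcal{L}_n}{\partial \theta_i}\right)^{2}

% Optimal Brain Damage saliency, with the Hessian diagonal approximated by F_{ii}
s_i \;=\; \tfrac{1}{2}\, H_{ii}\, \theta_i^{2} \;\approx\; \tfrac{1}{2}\, F_{ii}\, \theta_i^{2}

% Assumed per-layer sensitivity: average over the parameters \Theta_\ell of layer \ell
S_\ell \;=\; \frac{1}{|\Theta_\ell|}\sum_{i \in \Theta_\ell} F_{ii}
```

Here $\mathcal{L}_n$ is the negative log-likelihood of calibration sample $n$, and a larger $S_\ell$ indicates a layer whose weights are more sensitive to precision loss.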
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference