
metadata
title: CurvOpt SmarterModels
emoji: 📊
colorFrom: red
colorTo: red
sdk: gradio
sdk_version: 6.6.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Smarter Models, Smaller Footprint

CurvOpt-LLM — Realtime Optimizer

Curvature-guided mixed-precision optimization for LLMs. No retraining required.

What This Does

  • Loads any Hugging Face causal language model
  • Computes the diagonal Fisher curvature of each layer from real gradients on the calibration samples
  • Assigns FP32 / FP16 / BF16 per layer based on sensitivity
  • Rewrites and saves a deployable optimized model (downloadable ZIP)
  • Reports electricity, CO₂, and water footprint savings
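
The scoring and assignment steps above can be sketched in plain Python. This is a minimal illustration, not the app's actual code: the function names, the toy gradients, the 25% cut-offs, and the mapping (most sensitive layers stay in FP32, least sensitive go to FP16, the rest to BF16) are all assumptions. In a real run the per-sample gradients would come from backpropagating the language-modeling loss on each calibration sample.

```python
from statistics import fmean


def fisher_diagonal_score(per_sample_grads):
    """Empirical diagonal Fisher score for one layer.

    per_sample_grads: one flat list of gradient values per calibration sample.
    Returns the mean squared gradient over all samples and parameters.
    """
    squared = [g * g for grads in per_sample_grads for g in grads]
    return fmean(squared)


def assign_precision(scores, high_frac=0.25, low_frac=0.25):
    """Rank layers by score and map them to a dtype label.

    Assumed rule (hypothetical): the top `high_frac` of layers keep fp32,
    the bottom `low_frac` drop to fp16, everything in between uses bf16.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    n_high = max(1, round(n * high_frac))
    n_low = max(1, round(n * low_frac))
    plan = {}
    for rank, name in enumerate(ranked):
        if rank < n_high:
            plan[name] = "fp32"
        elif rank >= n - n_low:
            plan[name] = "fp16"
        else:
            plan[name] = "bf16"
    return plan


# Toy data: two calibration samples per layer, three parameters each.
toy_grads = {
    "layer.0": [[0.9, -1.1, 1.0], [1.2, -0.8, 1.0]],
    "layer.1": [[0.1, 0.2, -0.1], [0.0, 0.1, 0.2]],
    "layer.2": [[0.01, 0.0, 0.02], [0.0, 0.01, 0.0]],
    "layer.3": [[0.4, -0.5, 0.3], [0.5, 0.4, -0.6]],
}
scores = {name: fisher_diagonal_score(g) for name, g in toy_grads.items()}
plan = assign_precision(scores)
print(plan)
```

Ranking by score rather than using absolute thresholds keeps the sketch independent of model scale; whether the app uses ranks, thresholds, or the PPL tolerance to choose the split is not stated here.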

How to Use

  1. Select a model from the dropdown (or enter a custom HF model ID)
  2. Set the number of calibration samples (1–32) and the perplexity (PPL) tolerance
  3. Click Run Optimization
  4. Download the optimized model ZIP when done
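
One plausible reading of the PPL tolerance in step 2 is a cap on the relative perplexity increase of the optimized model over the baseline. The sketch below illustrates that reading only; the function names and the interpretation of the tolerance as a fraction (0.10 = 10%) are assumptions, not documented behavior of the app.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, natural log)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def within_tolerance(baseline_ppl, optimized_ppl, tolerance):
    """True if the optimized model's PPL rose by at most `tolerance` (relative)."""
    return optimized_ppl <= baseline_ppl * (1.0 + tolerance)


# Toy per-token losses for the same text under both models.
baseline = perplexity([2.0, 2.2, 1.8])     # mean NLL 2.00
optimized = perplexity([2.05, 2.2, 1.9])   # mean NLL 2.05
accepted = within_tolerance(baseline, optimized, tolerance=0.10)
print(round(baseline, 3), round(optimized, 3), accepted)
```

Under this reading, a tighter tolerance would presumably force more layers to stay in higher precision.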

Supported Models

OPT family · GPT-2 family · Pythia · Phi · BLOOM · Mistral · Llama-2 · Qwen · Falcon, plus any other AutoModelForCausalLM-compatible model.

Research

Based on Fisher Information / Optimal Brain Damage curvature analysis. Novel contribution: per-request curvature-gated mixed precision with user intent feedback.
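
For reference, the curvature quantities behind this approach can be written as follows. This is the textbook formulation, not necessarily the exact scoring used by the app; in particular, the per-layer aggregate S_ℓ (a mean over the layer's parameters) is an assumption.

```latex
% Optimal Brain Damage: second-order change in the loss L for a weight
% perturbation \delta w, keeping only the diagonal of the Hessian H
\Delta L \approx \frac{1}{2} \sum_i H_{ii}\, \delta w_i^{2}

% Diagonal Fisher information, used as a proxy for H_{ii}
F_{ii} = \mathbb{E}_{x}\!\left[ \left( \frac{\partial \log p_w(x)}{\partial w_i} \right)^{2} \right]

% Assumed per-layer sensitivity: mean of F_{ii} over the parameters of layer \ell
S_\ell = \frac{1}{|\ell|} \sum_{i \in \ell} F_{ii}
```

Casting a layer to lower precision perturbs its weights by rounding error, so a layer with large S_ℓ is expected to change the loss more for the same perturbation, which is the motivation for keeping such layers in higher precision.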

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference