Smash or Transformer -- ViT-Small checkpoints
Vision Transformers that predict a Pokemon's crowd "smash" fraction (0-100) from a single image -- i.e. how attractive the internet finds it. Trained on official artwork, in-game sprites, and Safebooru fan-art, with labels from aggregate votes on pokesmash.xyz.
Code, docs, and full reproduction guides: https://github.com/byrte1024/SmashOrTransformer
Checkpoints
All checkpoints are timm ViT-Small/16 models at 224x224 input resolution with
a scalar regression head, fine-tuned with soft-label BCE (binary cross-entropy).
Each .pt file holds model_state, config, and metrics.
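As a rough illustration of soft-label BCE, the sketch below computes the loss for a single scalar logit against a fractional target. It assumes the 0-100 smash fraction is divided by 100 to form the target and that the head output is passed through a sigmoid. Neither detail is confirmed here, and the function names are hypothetical rather than taken from the repository.

```python
import math


def soft_label_bce(logit: float, target: float) -> float:
    """Binary cross-entropy between sigmoid(logit) and a soft target in [0, 1].

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def logit_to_score(logit: float) -> float:
    """Map a raw logit to the 0-100 scale via a sigmoid (assumed convention)."""
    if logit >= 0:
        prob = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)
        prob = e / (1.0 + e)
    return 100.0 * prob


# Illustrative values only: a Pokemon with a 70% smash fraction.
target = 70 / 100
loss_at_zero = soft_label_bce(0.0, target)
loss_at_match = soft_label_bce(math.log(0.7 / 0.3), target)
```

Unlike hard-label BCE, the loss does not reach zero when the target is fractional. It is smallest when the predicted probability equals the target, which is why `loss_at_match` is lower than `loss_at_zero`.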
| File | Sources | Spearman (all_avg) | Notes |
|---|---|---|---|
| `vit_small_mixed_v1.pt` | portrait + in-game + booru | 0.770 | recommended |
| `vit_small_portraits_v1.pt` | portrait + in-game | 0.690 | sprite-only baseline |
| `vit_small_mixed_v2.pt` | + heavy booru aug | 0.734 | deprecated (regression) |
Spearman is the rank correlation from a cross-evaluation on a common held-out
set of 102 Pokemon, so the checkpoints are directly comparable. The
`*.calibration.json` files are the isotonic calibration maps (mixed_v2 has none).
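For readers unfamiliar with the metric, here is a minimal pure-Python sketch of Spearman rank correlation: Pearson correlation computed on the ranks, with ties given their average rank. It is not the repository's evaluation code, the helper names are hypothetical, and how `all_avg` aggregates scores is not specified here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(pred, truth):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(pred) != len(truth) or len(pred) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(pred), average_ranks(truth)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


# Made-up predictions and labels, for illustration only.
predicted = [12.0, 55.0, 31.0, 80.0]
crowd = [10.0, 60.0, 40.0, 95.0]
rho = spearman(predicted, crowd)
```

Because only the ordering matters, a model can score well on this metric even when its raw outputs are off in scale, which is one reason a separate calibration map can still be useful.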
Usage
```bash
git clone https://github.com/byrte1024/SmashOrTransformer && cd SmashOrTransformer
uv sync
uv run python download_models.py  # fetches vit_small_mixed_v1 into runs/
uv run python -m model.infer --checkpoint runs/vit_small_mixed_v1/checkpoints/best.pt img.png
```
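Once a checkpoint has been downloaded, it can also be inspected directly. The sketch below relies only on the three keys named in the Checkpoints section (`model_state`, `config`, `metrics`). The helper names are hypothetical, and the contents of `config` and `metrics` are not documented here, so they are returned as-is.

```python
from pathlib import Path


def summarize_checkpoint(ckpt: dict) -> dict:
    """Summarize a checkpoint dict that has the three documented keys."""
    return {
        "num_tensors": len(ckpt["model_state"]),  # entries in the state dict
        "config": ckpt["config"],
        "metrics": ckpt["metrics"],
    }


def load_checkpoint(path):
    """Load a .pt file on CPU. PyTorch is imported lazily so the helper
    above stays usable without it."""
    import torch

    # Recent PyTorch versions default to weights_only=True. If the file
    # stores non-tensor objects that this mode rejects, pass
    # weights_only=False, and only for files you trust.
    return torch.load(Path(path), map_location="cpu")
```

With PyTorch installed, `summarize_checkpoint(load_checkpoint("runs/vit_small_mixed_v1/checkpoints/best.pt"))` should show what the file contains. Rebuilding the model from `model_state` is best left to the repository's own `model.infer` module, since the exact head layout is not described here.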
Dataset: supernovayuli/smash-or-transformer-data
License
other -- the training data includes third-party fan-art and official assets
that are not ours to relicense. Weights are provided for research use.