
Model card for hpx_former_b36

The model hpx_former_b36 is part of the HyenaPixel model family proposed in the paper "HyenaPixel: Global Image Context with Convolutions". HyenaPixel uses large convolutions as a replacement for attention by extending Hyena (Paper and GitHub) to support bidirectional and two-dimensional input. The operator is integrated into the MetaFormer (Paper and GitHub) framework.
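
To give a feel for what a large, bidirectional, two-dimensional convolution means, the sketch below computes a non-causal "same"-size 2D convolution with FFTs using NumPy. It is a generic illustration and not the HyenaPixel operator itself; the function name `global_conv2d_fft` and the single-channel setup are assumptions made for this example.

```python
import numpy as np


def global_conv2d_fft(x, kernel):
    """Non-causal 'same'-size 2D convolution of x (H, W) with kernel (Kh, Kw).

    Illustrative only: a plain linear convolution evaluated with FFTs,
    not the HyenaPixel operator.
    """
    h, w = x.shape
    kh, kw = kernel.shape
    # Zero-pad to the full linear-convolution size so nothing wraps around.
    fh, fw = h + kh - 1, w + kw - 1
    spectrum = np.fft.rfft2(x, s=(fh, fw)) * np.fft.rfft2(kernel, s=(fh, fw))
    full = np.fft.irfft2(spectrum, s=(fh, fw))
    # Crop a centered window: every output position sees context on all
    # sides (bidirectional), unlike a causal 1D sequence convolution.
    top, left = (kh - 1) // 2, (kw - 1) // 2
    return full[top:top + h, left:left + w]
```

With a kernel close to the size of the feature map, every output position can draw on most of the input, which is the sense in which a large convolution can provide global context.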

The official PyTorch implementation of HyenaPixel can be found on GitHub.

Models

Model               Resolution  Params  Top-1 Acc. (%)  Download
hpx_former_s18      224         29M     83.2            HuggingFace
hpx_former_s18_384  384         29M     84.7            HuggingFace
hb_former_s18       224         28M     83.5            HuggingFace
c_hpx_former_s18    224         28M     83.0            HuggingFace
hpx_a_former_s18    224         28M     83.6            HuggingFace
hb_a_former_s18     224         27M     83.2            HuggingFace
hpx_former_b36      224         111M    84.9            HuggingFace
hb_former_b36       224         102M    85.2            HuggingFace

Usage

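Install the package:

pip install git+https://github.com/spravil/HyenaPixel.git

Then create the model with timm:

import timm
import hyenapixel.models  # imported for its side effect: makes the HyenaPixel models available to timm

model = timm.create_model("hpx_former_b36", pretrained=True)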
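
Once the model is created, classifying a single image could look like the sketch below. It assumes the model's pretrained config carries the preprocessing settings that `timm.data.resolve_data_config` reads, and that Pillow and torch are installed. The helper names `softmax`, `top_k`, and `classify` are hypothetical and not part of HyenaPixel or timm.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return the k most likely (class_index, probability) pairs."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def classify(image_path, model_name="hpx_former_b36", k=5):
    """Sketch: run one image through the model and return top-k classes."""
    # Heavy dependencies are imported here so the helpers above stay usable
    # without them.
    import timm
    import torch
    from PIL import Image
    import hyenapixel.models  # noqa: F401  (makes the models available to timm)

    model = timm.create_model(model_name, pretrained=True).eval()
    # Assumption: the pretrained config defines input size, mean, and std.
    config = timm.data.resolve_data_config({}, model=model)
    transform = timm.data.create_transform(**config)
    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        logits = model(transform(image).unsqueeze(0))[0]
    return top_k(softmax(logits.tolist()), k)
```

Calling `classify("cat.jpg")` with a local image path (the file name is a placeholder) would return five `(class_index, probability)` pairs; mapping the indices to label names is left to the caller.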

Bibtex

@article{spravil2024hyenapixel,
  title={HyenaPixel: Global Image Context with Convolutions},
  author={Julian Spravil and Sebastian Houben and Sven Behnke},
  journal={arXiv preprint arXiv:2402.19305},
  year={2024},
}

Safetensors checkpoint: model size 105M params, tensor type F32
