---
tags:
- image-classification
- timm
library_name: timm
license: apache-2.0
datasets:
- imagenet-1k
metrics:
- accuracy
---
# Model card for hpx_former_s18

`hpx_former_s18` belongs to the HyenaPixel model family proposed in the paper ["HyenaPixel: Global Image Context with Convolutions"](https://arxiv.org/abs/2402.19305).
HyenaPixel replaces attention with large convolutions by extending Hyena ([Paper](https://arxiv.org/abs/2302.10866) and [GitHub](https://github.com/HazyResearch/safari/)) to support bidirectional, two-dimensional input.
The operator is integrated into the MetaFormer ([Paper](https://arxiv.org/abs/2210.13452) and [GitHub](https://github.com/sail-sg/metaformer)) framework.

The official PyTorch implementation of HyenaPixel can be found on [GitHub](https://github.com/spravil/HyenaPixel).

## Models

| Model              | Resolution | Params | Top-1 Acc (%) |                                   Download                                   |
| :----------------- | :--------: | :----: | :------: | :--------------------------------------------------------------------------: |
| hpx_former_s18     |    224     |  29M   |   83.2   |   [HuggingFace](https://huggingface.co/Spravil/hpx_former_s18.westai_in1k)   |
| hpx_former_s18_384 |    384     |  29M   |   84.7   | [HuggingFace](https://huggingface.co/Spravil/hpx_former_s18.westai_in1k_384) |
| hb_former_s18      |    224     |  28M   |   83.5   |   [HuggingFace](https://huggingface.co/Spravil/hb_former_s18.westai_in1k)    |
| c_hpx_former_s18   |    224     |  28M   |   83.0   |  [HuggingFace](https://huggingface.co/Spravil/c_hpx_former_s18.westai_in1k)  |
| hpx_a_former_s18   |    224     |  28M   |   83.6   |  [HuggingFace](https://huggingface.co/Spravil/hpx_a_former_s18.westai_in1k)  |
| hb_a_former_s18    |    224     |  27M   |   83.2   |  [HuggingFace](https://huggingface.co/Spravil/hb_a_former_s18.westai_in1k)   |
| hpx_former_b36     |    224     |  111M  |   84.9   |   [HuggingFace](https://huggingface.co/Spravil/hpx_former_b36.westai_in1k)   |
| hb_former_b36      |    224     |  102M  |   85.2   |   [HuggingFace](https://huggingface.co/Spravil/hb_former_b36.westai_in1k)    |

## Usage

```bash
pip install git+https://github.com/spravil/HyenaPixel.git
```

```python
import timm
import hyenapixel.models

model = timm.create_model("hpx_former_s18", pretrained=True)
```
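
Once the model is created, classifying an image follows the usual timm pattern. The sketch below is a minimal example, not part of the official repository: the helper names `top_k` and `classify` are hypothetical, and it assumes that the model exposes a timm-compatible pretrained configuration (so that `resolve_data_config` can derive the preprocessing) and returns ImageNet-1k logits of shape `(1, 1000)`.

```python
import math


def top_k(logits, k=5):
    """Apply a softmax to a list of logits and return the k most likely (index, probability) pairs."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [value / total for value in exps]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in order[:k]]


def classify(image_path, model_name="hpx_former_s18", k=5):
    """Hypothetical helper: run one image through a pretrained model and return top-k predictions."""
    # Heavy imports are kept local so that top_k stays usable without them.
    import timm
    import torch
    from PIL import Image
    from timm.data import create_transform, resolve_data_config

    import hyenapixel.models  # noqa: F401  (registers the HyenaPixel models with timm)

    model = timm.create_model(model_name, pretrained=True).eval()

    # Assumption: the pretrained config carries input size, mean, and std.
    config = resolve_data_config({}, model=model)
    transform = create_transform(**config)

    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        logits = model(transform(image).unsqueeze(0))[0]
    return top_k(logits.tolist(), k)
```

Calling `classify("path/to/image.jpg")` would return a list of `(class_index, probability)` pairs; mapping the indices to ImageNet-1k label names is left out here.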

## BibTeX

```
@article{spravil2024hyenapixel,
  title={HyenaPixel: Global Image Context with Convolutions},
  author={Julian Spravil and Sebastian Houben and Sven Behnke},
  journal={arXiv preprint arXiv:2402.19305},
  year={2024},
}
```