coreml-FastViT-MA36 / README.md
tags:
  - image-classification
library_name: coreml
license: other
license_name: apple-ascl
license_link: LICENSE
datasets:
  - imagenet-1k

FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization

Please observe the original license (see LICENSE).

Model Details

Evaluation - Variants

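| Variant | Parameters | Size (MB) | Weight precision | Act. precision | Δ PyTorch acc. |
| ------- | ---------- | --------- | ---------------- | -------------- | -------------- |
| T8      | 3.6M       | 7.8       | Float16          | Float16        | -0.9%          |
| MA36    | 42.7M      | 84        | Float16          | Float16        | -0.06%         |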
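
As a rough sanity check on the sizes above, the parameter counts can be converted to an expected weight footprint. This is a minimal sketch, assuming 2 bytes per Float16 parameter and 1 MB = 10^6 bytes; neither assumption is stated in the table, and the helper name `estimated_weight_mb` is purely illustrative.

```python
# Rough estimate of weight storage from parameter count.
# Assumptions (not from the model card): 2 bytes per Float16 weight,
# 1 MB = 1e6 bytes, and no allowance for non-weight data in the package.
BYTES_PER_FLOAT16 = 2

# (parameter count, reported size in MB), copied from the variants table.
VARIANTS = {
    "T8": (3.6e6, 7.8),
    "MA36": (42.7e6, 84.0),
}


def estimated_weight_mb(num_params: float) -> float:
    """Return the estimated Float16 weight size in MB (1 MB = 1e6 bytes)."""
    return num_params * BYTES_PER_FLOAT16 / 1e6


for name, (params, reported_mb) in VARIANTS.items():
    estimate = estimated_weight_mb(params)
    relative_gap = abs(estimate - reported_mb) / reported_mb
    print(f"{name}: estimated {estimate:.1f} MB, reported {reported_mb:.1f} MB "
          f"(gap {relative_gap:.1%})")
```

Under these assumptions the estimates land within about 10% of the reported sizes, which is consistent with both variants storing weights in Float16.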

Evaluation - Inference time

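| Variant | Device            | OS   | Inference time (ms) | Dominant compute unit |
| ------- | ----------------- | ---- | ------------------- | --------------------- |
| T8      | iPhone 12 Pro Max | 17.5 | 0.79                | Neural Engine         |
| T8      | M3 Max            | 14.4 | 0.62                | Neural Engine         |
| MA36    | iPhone 12 Pro Max | 18.0 | 4.50                | Neural Engine         |
| MA36    | M3 Max            | 15.0 | 2.99                | Neural Engine         |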
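
To put the latencies above in perspective, they can be converted to an approximate throughput and to a relative cost of MA36 versus T8 on each device. This is a minimal sketch, assuming back-to-back inference at batch size 1 with no pre- or post-processing overhead. That assumption is not part of the reported measurements, and the function names are illustrative.

```python
# Latencies in milliseconds, copied from the inference-time table.
LATENCY_MS = {
    ("T8", "iPhone 12 Pro Max"): 0.79,
    ("T8", "M3 Max"): 0.62,
    ("MA36", "iPhone 12 Pro Max"): 4.50,
    ("MA36", "M3 Max"): 2.99,
}


def images_per_second(latency_ms: float) -> float:
    """Approximate throughput, assuming sequential batch-size-1 inference."""
    return 1000.0 / latency_ms


def ma36_vs_t8_ratio(device: str) -> float:
    """How many times longer MA36 takes than T8 on the given device."""
    return LATENCY_MS[("MA36", device)] / LATENCY_MS[("T8", device)]


for (variant, device), ms in LATENCY_MS.items():
    print(f"{variant:5s} {device:18s} {ms:5.2f} ms  "
          f"~{images_per_second(ms):7.1f} images/s")

for device in ("iPhone 12 Pro Max", "M3 Max"):
    print(f"{device}: MA36 takes {ma36_vs_t8_ratio(device):.1f}x as long as T8")
```

Note that the two variants were measured on different OS versions, so the ratios are indicative rather than a controlled comparison.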

Citation

@inproceedings{vasufastvit2023,
  author = {Pavan Kumar Anasosalu Vasu and James Gabriel and Jeff Zhu and Oncel Tuzel and Anurag Ranjan},
  title = {FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year = {2023}
}