---
license: mit
tags:
- vit
- image classification
- ggml
---
# Vision Transformer (ViT) models for image classification converted to ggml format
[Available models](https://github.com/staghado/vit.cpp)
| Model | Disk size | Memory usage | SHA |
| --- | --- | --- | --- |
| tiny | 12 MB | ~20 MB | `25ce65ff60e08a1a5b486685b533d79718e74c0f` |
| small | 45 MB | ~52 MB | `7a9f85340bd1a3dcd4275f46d5ee1db66649700e` |
| base | 174 MB | ~179 MB | `a10d29628977fe27691edf55b7238f899b8c02eb` |
| large | 610 MB | ~597 MB | `5f27087930f21987050188f9dc9eea75ac607214` |
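
The SHA column can be used to check that a downloaded model file is intact. The sketch below assumes the values are SHA-1 digests of the ggml model files (they are 40 hex characters long, which is consistent with SHA-1). The helper names `sha1_of_file` and `verify_model` are illustrative and not part of vit.cpp.

```python
import hashlib

# SHA values copied from the table above, keyed by model size.
EXPECTED_SHA = {
    "tiny": "25ce65ff60e08a1a5b486685b533d79718e74c0f",
    "small": "7a9f85340bd1a3dcd4275f46d5ee1db66649700e",
    "base": "a10d29628977fe27691edf55b7238f899b8c02eb",
    "large": "5f27087930f21987050188f9dc9eea75ac607214",
}


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large model files are never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path, size):
    """Return True if the file at `path` matches the listed digest for `size`.

    Assumption: the table's SHA column is a SHA-1 digest of the model file.
    """
    return sha1_of_file(path) == EXPECTED_SHA[size]
```

For example, `verify_model("path/to/downloaded-model-file", "tiny")` returns `True` only when the file's digest matches the `tiny` row; the path here is a placeholder.
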
The models are pre-trained on ImageNet-21k and then fine-tuned on ImageNet-1k,
with a patch size of 16 and an input image size of 224×224.
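
To make the patch and image sizes concrete, the sketch below computes how many tokens the transformer sees per image. It assumes the standard ViT layout of non-overlapping square patches plus one class token; the function name `vit_sequence_length` is illustrative and not part of vit.cpp.

```python
def vit_sequence_length(image_size=224, patch_size=16, class_token=True):
    """Number of tokens for a square image split into square patches.

    Assumption: standard ViT tokenization (non-overlapping patches,
    optionally one extra class token).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size  # 224 // 16 = 14
    num_patches = patches_per_side ** 2          # 14 * 14 = 196
    return num_patches + (1 if class_token else 0)
```

With the defaults this gives 196 patches, or 197 tokens once the class token is included.
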
For more information, visit:
https://github.com/staghado/vit.cpp