---
tags:
- image-classification
- timm
- MobileNetV4
license: apache-2.0
datasets:
- imagenet-1k
pipeline_tag: image-classification
---

# Model card for MobileNetV4_Conv_Large_TFLite_256

A MobileNet-V4 image classification model. Trained on ImageNet-1k by Ross Wightman.

Converted to TFLite Float32 & Float16 formats by Youssef Boulaouane.

## Model Details
- **PyTorch Weights:** https://huggingface.co/timm/mobilenetv4_conv_large.e500_r256_in1k
- **Model Type:** Image classification
- **Model Stats:**
  - Params (M): 32.6
  - GMACs: 2.9
  - Activations (M): 12.1
  - Input Shape: (1, 256, 256, 3)
- **Dataset:** ImageNet-1k
- **Papers:**
  - MobileNetV4 -- Universal Models for the Mobile Ecosystem: https://arxiv.org/abs/2404.10518
  - PyTorch Image Models: https://github.com/huggingface/pytorch-image-models
- **Original:** https://github.com/tensorflow/models/tree/master/official/vision

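
The parameter count above allows a rough estimate of how much storage the weights need in each of the two converted formats. The sketch below is back-of-the-envelope arithmetic only: it assumes every parameter is stored at the stated precision and ignores file-format overhead, so actual `.tflite` file sizes will differ somewhat. The helper name `estimated_weight_megabytes` is illustrative and not part of any library.

```python
# Parameter count taken from the Model Stats list (in millions).
PARAMS_MILLIONS = 32.6

# Bytes per stored weight for single- and half-precision floats.
BYTES_PER_WEIGHT = {"float32": 4, "float16": 2}


def estimated_weight_megabytes(params_millions, bytes_per_weight):
    """Approximate weight storage in MB (1 MB = 10**6 bytes), excluding file overhead."""
    return params_millions * bytes_per_weight


for name, width in BYTES_PER_WEIGHT.items():
    size_mb = estimated_weight_megabytes(PARAMS_MILLIONS, width)
    print(f"{name}: ~{size_mb:.1f} MB of weights")
```

Under these assumptions the Float16 variant needs about half the weight storage of the Float32 variant, which is usually the main reason to choose it for on-device use.
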
## Model Usage
### Image Classification in Python
```python
import numpy as np
import tensorflow as tf
from PIL import Image

# Placeholder paths: replace with your .tflite model file and input image
tf_model_path = "path/to/model.tflite"
image_path = "path/to/image.jpg"

# Load the label file (one class name per line)
with open('imagenet_classes.txt', 'r') as file:
    lines = file.readlines()

index_to_label = {index: line.strip() for index, line in enumerate(lines)}

# Initialize interpreter and IO details
tfl_model = tf.lite.Interpreter(model_path=tf_model_path)
tfl_model.allocate_tensors()
input_details = tfl_model.get_input_details()
output_details = tfl_model.get_output_details()

# Load the image as RGB and resize it to the model's 256x256 input
image = Image.open(image_path).convert('RGB').resize((256, 256), Image.BICUBIC)

# Scale to [0, 1], then normalize with the ImageNet mean and std
image = np.array(image, dtype=np.float32)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
image = (image / 255.0 - mean) / std

# Add a batch dimension: (256, 256, 3) -> (1, 256, 256, 3)
image = np.expand_dims(image, axis=0)

# Inference and postprocessing
input_detail = input_details[0]
tfl_model.set_tensor(input_detail["index"], image)
tfl_model.invoke()

tfl_output = tfl_model.get_tensor(output_details[0]["index"])
tfl_output_tensor = tf.convert_to_tensor(tfl_output)
tfl_softmax_output = tf.nn.softmax(tfl_output_tensor, axis=1)

tfl_top5_probs, tfl_top5_indices = tf.math.top_k(tfl_softmax_output, k=5)

# Get the top5 class labels and probabilities
tfl_probs_list = tfl_top5_probs[0].numpy().tolist()
tfl_index_list = tfl_top5_indices[0].numpy().tolist()

for index, prob in zip(tfl_index_list, tfl_probs_list):
    print(f"{index_to_label[index]}: {round(prob*100, 2)}%")
```
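
The softmax and top-5 steps in the example above use TensorFlow ops, but the same postprocessing can be done without them, which may be handy where only a lightweight interpreter runtime is installed. The following is a minimal sketch using the Python standard library only; the helper names `softmax` and `top_k` are illustrative, and the toy logits stand in for `tfl_output[0].tolist()`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return (index, probability) pairs for the k largest probabilities."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy logits used here in place of the real model output.
toy_logits = [0.1, 2.0, -1.0, 0.5]
for index, prob in top_k(softmax(toy_logits), k=2):
    print(f"class {index}: {prob * 100:.2f}%")
```

With real model output, each returned index can be looked up in `index_to_label` exactly as in the loop above.
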

### Deployment on Mobile
Refer to the guides available here: https://ai.google.dev/edge/lite/inference

## Citation
```bibtex
@article{qin2024mobilenetv4,
  title={MobileNetV4-Universal Models for the Mobile Ecosystem},
  author={Qin, Danfeng and Leichner, Chas and Delakis, Manolis and Fornoni, Marco and Luo, Shixin and Yang, Fan and Wang, Weijun and Banbury, Colby and Ye, Chengxi and Akin, Berkin and others},
  journal={arXiv preprint arXiv:2404.10518},
  year={2024}
}
```
```bibtex
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}
```