This model is trained to classify app introduction images into three categories: Surrounded Screenshot, Screenshot, and Irrelevant.
Code and dataset can be found at https://github.com/Jl-wei/guing
Usage with the transformers pipeline
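from PIL import Image
from transformers import pipeline

# device=0 selects the first GPU; use device=-1 to run on the CPU
classifier = pipeline("image-classification", model="Jl-wei/app-intro-img-classifier", device=0)

img_path = "path/to/image.png"  # placeholder: replace with the path to your image
image = Image.open(img_path)
result = classifier(image)  # list of {"label": ..., "score": ...} dicts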
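An image-classification pipeline returns a list of dictionaries, each with a "label" and a "score". The following is a minimal sketch of how that list might be reduced to a single decision. The helper name `top_prediction`, the `min_score` threshold, and the example scores are all hypothetical, and the label strings are assumed to match the three category names above; check the model's configuration for the exact labels it emits.

```python
from typing import Dict, List, Optional


def top_prediction(result: List[Dict], min_score: float = 0.0) -> Optional[str]:
    """Return the highest-scoring label, or None if its score is below min_score."""
    if not result:
        return None
    best = max(result, key=lambda entry: entry["score"])
    return best["label"] if best["score"] >= min_score else None


# Hypothetical data shaped like a pipeline result; not real model output.
example = [
    {"label": "Screenshot", "score": 0.91},
    {"label": "Surrounded Screenshot", "score": 0.07},
    {"label": "Irrelevant", "score": 0.02},
]
print(top_prediction(example, min_score=0.5))
```

With the classifier from the snippet above, the same helper could be called as `top_prediction(classifier(image), min_score=0.5)` to discard low-confidence predictions.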
This is the app introduction image classifier of the following paper:
@misc{wei2024guing,
title={GUing: A Mobile GUI Search Engine using a Vision-Language Model},
author={Jialiang Wei and Anne-Lise Courbis and Thomas Lambolais and Binbin Xu and Pierre Louis Bernard and Gérard Dray and Walid Maalej},
year={2024},
eprint={2405.00145},
archivePrefix={arXiv}
}
Please note that the model can only be used for academic purposes.
Base model: google/vit-base-patch16-224-in21k