Papers
arxiv:2302.00275

Learning Generalized Zero-Shot Learners for Open-Domain Image Geolocalization

Published on Feb 1, 2023
Authors:
,
,

Abstract

Image geolocalization is the challenging task of predicting the geographic coordinates of origin for a given photo. It is an unsolved problem relying on the ability to combine visual clues with general knowledge about the world to make accurate predictions across geographies. We present https://huggingface.co/geolocal/StreetCLIP{StreetCLIP}, a robust, publicly available foundation model not only achieving state-of-the-art performance on multiple open-domain image geolocalization benchmarks but also doing so in a zero-shot setting, outperforming supervised models trained on more than 4 million images. Our method introduces a meta-learning approach for generalized zero-shot learning by pretraining CLIP from synthetic captions, grounding CLIP in a domain of choice. We show that our method effectively transfers CLIP's generalized zero-shot capabilities to the domain of image geolocalization, improving in-domain generalized zero-shot performance without finetuning StreetCLIP on a fixed set of classes.

Community

Sign up or log in to comment

Models citing this paper 1

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2302.00275 in a dataset README.md to link it from this page.

Spaces citing this paper 5

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.