---
license: cc
datasets:
- Zilun/RS5M
language:
- en
metrics:
- accuracy
- recall
---

## GeoRSCLIP Model

* **GeoRSCLIP with ViT-B-32 and ViT-H-14 backbones**
* **GeoRSCLIP-FT for retrieval**

### Installation

* Install PyTorch following the instructions on the official website (tested with torch 2.0.1 + CUDA 11.8 and torch 2.1.0 + CUDA 12.1):

```bash
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
```

* Install the other dependencies:

```bash
pip install pillow pandas scikit-learn ftfy tqdm matplotlib transformers adapter-transformers open_clip_torch pycocotools timm clip-benchmark torch-rs
```

### Usage

* Clone the repo from https://huggingface.co/Zilun/GeoRSCLIP:

```bash
git clone https://huggingface.co/Zilun/GeoRSCLIP
cd GeoRSCLIP
```

* Unzip the test data:

```bash
unzip data/rs5m_test_data.zip
```

* Run the inference script:

```bash
python codebase/inference.py --ckpt-path /your/local/path/to/RS5M_ViT-B-32.pt --test-dataset-dir /your/local/path/to/rs5m_test_data
```

* (Optional) If you just want to load the GeoRSCLIP model:

  * With the ViT-B-32 backbone:

```python
import open_clip
import torch
from inference_tool import get_preprocess

# Load the fine-tuned GeoRSCLIP weights on top of the OpenAI ViT-B/32 base model
ckpt_path = "/your/local/path/to/RS5M_ViT-B-32.pt"
model, _, _ = open_clip.create_model_and_transforms("ViT-B/32", pretrained="openai")
checkpoint = torch.load(ckpt_path, map_location="cpu")
msg = model.load_state_dict(checkpoint, strict=False)
model = model.to("cuda")
img_preprocess = get_preprocess(
    image_resolution=224,
)
```

  * With the ViT-H-14 backbone:

```python
import open_clip
import torch
from inference_tool import get_preprocess

# Load the fine-tuned GeoRSCLIP weights on top of the LAION-2B ViT-H/14 base model
ckpt_path = "/your/local/path/to/RS5M_ViT-H-14.pt"
model, _, _ = open_clip.create_model_and_transforms("ViT-H/14", pretrained="laion2b_s32b_b79k")
checkpoint = torch.load(ckpt_path, map_location="cpu")
msg = model.load_state_dict(checkpoint, strict=False)
model = model.to("cuda")
img_preprocess = get_preprocess(
    image_resolution=224,
)
```
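Once the model and preprocessor are loaded, retrieval works as with any CLIP-style model: encode images and texts, L2-normalize the embeddings, and rank by cosine similarity. The sketch below illustrates just that scoring step; random tensors stand in for the outputs of `model.encode_image(...)` and `model.encode_text(...)` so the logic runs without a checkpoint. In real use you would feed `img_preprocess`-ed image batches and tokenized captions through the model, and the `logit_scale` of 100.0 is the conventional CLIP temperature, not a value taken from this repo.

```python
import torch

# Stand-ins for model.encode_image(...) / model.encode_text(...) outputs.
# With the real ViT-B-32 model these would have embedding dimension 512.
image_features = torch.randn(4, 512)  # 4 images
text_features = torch.randn(3, 512)   # 3 captions

# L2-normalize so the dot product equals cosine similarity.
image_features = image_features / image_features.norm(dim=-1, keepdim=True)
text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# Similarity matrix: one row per image, one column per caption.
logit_scale = 100.0  # conventional CLIP temperature (an assumption here)
logits = logit_scale * image_features @ text_features.T
probs = logits.softmax(dim=-1)  # per-image distribution over the captions

best_caption = probs.argmax(dim=-1)  # best-matching caption index per image
print(probs.shape)  # torch.Size([4, 3])
```

For image-to-text retrieval you would sort each row of `probs` (or of the raw `logits`) in descending order; for text-to-image retrieval, softmax over `dim=0` instead.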