---
license: mit
library_name: transformers
widget:
- src: >-
    https://fema-cap-imagery.s3.amazonaws.com/Images/CAP_-_Flooding_Spring_2023/Source/IAWG_23-B-5061/A0005/D75_0793_DxO_PL6_P.jpg
  example_title: Example classification of flooded scene
pipeline_tag: image-classification
tags:
- LADI
- Aerial Imagery
- Disaster Response
- Emergency Management
datasets:
- MITLL/LADI-v2-dataset
---
# Model Card for MITLL/LADI-v2-classifier-large-reference
LADI-v2-classifier-large-reference is based on [microsoft/swinv2-large-patch4-window12to16-192to256-22kto1k-ft](https://huggingface.co/microsoft/swinv2-large-patch4-window12to16-192to256-22kto1k-ft) and fine-tuned on the [MITLL/LADI-v2-dataset](https://huggingface.co/datasets/MITLL/LADI-v2-dataset). LADI-v2-classifier is trained to identify labels of interest to disaster response managers from aerial images.

🔴 __IMPORTANT__ ❗🔴 This model is the 'reference' version of the model, which is trained on 80% of the 10,000 available images. It is provided to facilitate reproduction of our paper and is not intended to be used in deployment. For deployment, see the [MITLL/LADI-v2-classifier-small](https://huggingface.co/MITLL/LADI-v2-classifier-small) and [MITLL/LADI-v2-classifier-large](https://huggingface.co/MITLL/LADI-v2-classifier-large) models, which are trained on the full LADI v2 dataset (all splits).

## Model Details

### Model Description
The model architecture is based on swinv2 and fine-tuned on the LADI v2 dataset, which contains 10,000 aerial images labeled by volunteers from the Civil Air Patrol. The images are labeled using multi-label classification for the following classes:

- bridges_any
- buildings_any
- buildings_affected_or_greater
- buildings_minor_or_greater
- debris_any
- flooding_any
- flooding_structures
- roads_any
- roads_damage
- trees_any
- trees_damage
- water_any

This 'reference' model is trained only on the training split, which contains 8,000 images from 2015-2022. It is provided for the purpose of reproducing the results from the paper. The 'deploy' models are trained on the training, validation, and test splits combined, which together contain 10,000 images from 2015-2023. We recommend that anyone who wishes to use this model in production use the main versions of the models, [MITLL/LADI-v2-classifier-small](https://huggingface.co/MITLL/LADI-v2-classifier-small) and [MITLL/LADI-v2-classifier-large](https://huggingface.co/MITLL/LADI-v2-classifier-large).

- **Developed by:** Jeff Liu, Sam Scheele
- **Funded by:** Department of the Air Force under Air Force Contract No. FA8702-15-D-0001
- **License:** MIT
- **Finetuned from model:** [microsoft/swinv2-large-patch4-window12to16-192to256-22kto1k-ft](https://huggingface.co/microsoft/swinv2-large-patch4-window12to16-192to256-22kto1k-ft)

## How to Get Started with the Model

LADI-v2-classifier-large-reference is trained to identify features of interest to disaster response managers from aerial images. Use the code below to get started with the model.

The simplest way to perform inference is with the pipeline interface:

```python
from transformers import pipeline
image_url = "https://fema-cap-imagery.s3.amazonaws.com/Images/CAP_-_Flooding_Spring_2023/Source/IAWG_23-B-5061/A0005/D75_0793_DxO_PL6_P.jpg"

pipe = pipeline(model="MITLL/LADI-v2-classifier-large-reference")
print(pipe(image_url))
```

```
[{'label': 'buildings_any', 'score': 0.9995228052139282},
 {'label': 'water_any', 'score': 0.9990286827087402},
 {'label': 'flooding_structures', 'score': 0.9974568486213684},
 {'label': 'roads_any', 'score': 0.9963797926902771},
 {'label': 'flooding_any', 'score': 0.9872690439224243}]
```

For finer-grained control, see below:

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch
import requests
from PIL import Image
from io import BytesIO

image_url = "https://fema-cap-imagery.s3.amazonaws.com/Images/CAP_-_Flooding_Spring_2023/Source/IAWG_23-B-5061/A0005/D75_0793_DxO_PL6_P.jpg"

img_data = requests.get(image_url).content
img = Image.open(BytesIO(img_data))

processor = AutoImageProcessor.from_pretrained("MITLL/LADI-v2-classifier-large-reference")
model = AutoModelForImageClassification.from_pretrained("MITLL/LADI-v2-classifier-large-reference")

inputs = processor(img, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

predictions = torch.sigmoid(logits)[0].numpy()
labels = [(model.config.id2label[idx], predictions[idx]) for idx in range(len(predictions))]
print(labels)
```

```
[('bridges_any', 0.9697420597076416),
 ('buildings_any', 0.9995228052139282),
 ('buildings_affected_or_greater', 0.9863481521606445),
 ('buildings_minor_or_greater', 0.014774609357118607),
 ('debris_any', 0.00019898588652722538),
 ('flooding_any', 0.9872690439224243),
 ('flooding_structures', 0.9974568486213684),
 ('roads_any', 0.9963797926902771),
 ('roads_damage', 0.879313051700592),
 ('trees_any', 0.9782388210296631),
 ('trees_damage', 0.7547890543937683),
 ('water_any', 0.9990286827087402)]
```
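Since the model applies a sigmoid per class, scores are independent and a label is "predicted" only once you choose a cutoff. A minimal sketch using the example scores above (the 0.5 threshold is an illustrative default, not a value tuned by the authors; tune it per class and per application):

```python
# Example (label, score) pairs copied from the model output above.
scores = [
    ("bridges_any", 0.9697), ("buildings_any", 0.9995),
    ("buildings_affected_or_greater", 0.9863),
    ("buildings_minor_or_greater", 0.0148), ("debris_any", 0.0002),
    ("flooding_any", 0.9873), ("flooding_structures", 0.9975),
    ("roads_any", 0.9964), ("roads_damage", 0.8793),
    ("trees_any", 0.9782), ("trees_damage", 0.7548),
    ("water_any", 0.9990),
]

THRESHOLD = 0.5  # illustrative cutoff; adjust to trade precision vs. recall
predicted = [label for label, score in scores if score >= THRESHOLD]
print(predicted)
```

Lowering the threshold increases recall at the cost of precision; for triage applications where missed detections are costly, a lower per-class cutoff may be preferable.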

## Citation

**BibTeX:**

Paper forthcoming; a BibTeX entry will be added here.

---

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.  
  
This material is based upon work supported by the Department of the Air Force under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Department of the Air Force.  
  
© 2024 Massachusetts Institute of Technology.  
  
The software/firmware is provided to you on an As-Is basis  
  
Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.