---
pipeline_tag: image-segmentation
---

<!---
Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Instance Segmentation Example

Content:
- [PyTorch Version with Trainer](#pytorch-version-with-trainer)
- [Reload and Perform Inference](#reload-and-perform-inference)

## PyTorch Version with Trainer

This model is based on the script [`run_instance_segmentation.py`](https://github.com/huggingface/transformers/blob/main/examples/pytorch/instance-segmentation/run_instance_segmentation.py).
The script uses the [🤗 Trainer API](https://huggingface.co/docs/transformers/main_classes/trainer) to manage training automatically, including distributed environments.
Here, we fine-tune a [Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former) model on a subsample of the [ADE20K](https://huggingface.co/datasets/zhoubolei/scene_parse_150) dataset. We created a [small dataset](https://huggingface.co/datasets/qubvel-hf/ade20k-mini) with approximately 2,000 images containing only "person" and "car" annotations; all other pixels are marked as "background."
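
To get a feel for the data before training, the dataset can be inspected directly from the Hub. A minimal sketch, assuming the dataset exposes a standard `train` split:

```python
from datasets import load_dataset

# Peek at the fine-tuning data (assumes a "train" split exists)
ds = load_dataset("qubvel-hf/ade20k-mini", split="train")
print(ds)     # features and number of rows
print(ds[0])  # one example with its image and annotation
```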

Here is the `label2id` mapping for this model:

```python
label2id = {
    "person": 0,
    "car": 1,
}
```
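
The inverse mapping, `id2label`, is what the model config uses to name predicted classes; it can be derived directly from `label2id`:

```python
# Invert label2id; the model config stores this as `id2label`
id2label = {v: k for k, v in label2id.items()}
print(id2label)  # {0: 'person', 1: 'car'}
```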

The training was done with the following command:

```bash
python run_instance_segmentation.py \
    --model_name_or_path facebook/mask2former-swin-tiny-coco-instance \
    --output_dir finetune-instance-segmentation-ade20k-mini-mask2former \
    --dataset_name qubvel-hf/ade20k-mini \
    --do_reduce_labels \
    --image_height 256 \
    --image_width 256 \
    --do_train \
    --fp16 \
    --num_train_epochs 40 \
    --learning_rate 1e-5 \
    --lr_scheduler_type constant \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --dataloader_num_workers 8 \
    --dataloader_persistent_workers \
    --dataloader_prefetch_factor 4 \
    --do_eval \
    --evaluation_strategy epoch \
    --logging_strategy epoch \
    --save_strategy epoch \
    --save_total_limit 2 \
    --push_to_hub
```
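
Because the script is built on the Trainer API, the same run scales to multiple GPUs without code changes. A sketch of a distributed launch, assuming a machine with 4 GPUs (append the remaining arguments from the command above unchanged):

```bash
# Multi-GPU launch of the same fine-tuning run (hypothetical 4-GPU node);
# add the rest of the arguments from the single-GPU command above.
torchrun --nproc_per_node=4 run_instance_segmentation.py \
    --model_name_or_path facebook/mask2former-swin-tiny-coco-instance \
    --dataset_name qubvel-hf/ade20k-mini \
    --output_dir finetune-instance-segmentation-ade20k-mini-mask2former
```

Note that with `--per_device_train_batch_size 8` and `--gradient_accumulation_steps 2`, the effective batch size is 16 per device.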

## Reload and Perform Inference

You can easily load this trained model and perform inference as follows:

```python
import torch
import requests

from PIL import Image
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor

# Load image
image = Image.open(requests.get("http://farm4.staticflickr.com/3017/3071497290_31f0393363_z.jpg", stream=True).raw)

# Load model and image processor
device = "cuda"
checkpoint = "qubvel-hf/finetune-instance-segmentation-ade20k-mini-mask2former"

model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint, device_map=device)
image_processor = Mask2FormerImageProcessor.from_pretrained(checkpoint)

# Run inference on image
inputs = image_processor(images=[image], return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)

# Post-process outputs; target_sizes expects (height, width), hence image.size[::-1]
outputs = image_processor.post_process_instance_segmentation(outputs, target_sizes=[image.size[::-1]])

print("Mask shape: ", outputs[0]["segmentation"].shape)
print("Mask values: ", outputs[0]["segmentation"].unique())
for segment in outputs[0]["segments_info"]:
    print("Segment: ", segment)
```

```
Mask shape: torch.Size([427, 640])
Mask values: tensor([-1., 0., 1., 2., 3., 4., 5., 6.])
Segment: {'id': 0, 'label_id': 0, 'was_fused': False, 'score': 0.946127}
Segment: {'id': 1, 'label_id': 1, 'was_fused': False, 'score': 0.961582}
Segment: {'id': 2, 'label_id': 1, 'was_fused': False, 'score': 0.968367}
Segment: {'id': 3, 'label_id': 1, 'was_fused': False, 'score': 0.819527}
Segment: {'id': 4, 'label_id': 1, 'was_fused': False, 'score': 0.655761}
Segment: {'id': 5, 'label_id': 1, 'was_fused': False, 'score': 0.531299}
Segment: {'id': 6, 'label_id': 1, 'was_fused': False, 'score': 0.929477}
```
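
Each entry in `segments_info` describes one detected instance: `id` is the pixel value of that instance in the returned mask (`-1` marks pixels assigned to no instance), `label_id` indexes into the `label2id` mapping above, and `score` is the detection confidence. A minimal sketch for turning this into one binary mask per instance, reusing `outputs` and `model` from the snippet above:

```python
# Build one boolean mask per detected instance
segmentation = outputs[0]["segmentation"].cpu()

for segment in outputs[0]["segments_info"]:
    instance_mask = segmentation == segment["id"]  # boolean (H, W) tensor
    label = model.config.id2label[segment["label_id"]]
    print(f"{label} #{segment['id']}: {int(instance_mask.sum())} pixels, score={segment['score']:.3f}")
```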

Use the following code to visualize the results:

```python
import numpy as np
import matplotlib.pyplot as plt

# Move the mask to CPU before converting to a NumPy array
segmentation = outputs[0]["segmentation"].cpu().numpy()

# Input image and predicted instance map side by side
plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(np.array(image))
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(segmentation)
plt.axis("off")
plt.show()
```

![Result](https://i.imgur.com/rZmaRjD.png)
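
An overlay can be easier to read than side-by-side panels. One possible variant (a sketch reusing `image` and `segmentation` from above) hides unassigned pixels before blending:

```python
# Overlay the instance map on the photo, masking out unassigned (-1) pixels
overlay = np.ma.masked_where(segmentation == -1, segmentation)

plt.figure(figsize=(8, 8))
plt.imshow(np.array(image))
plt.imshow(overlay, alpha=0.5)
plt.axis("off")
plt.show()
```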