# Segment Anything 2.1 RKNN2

Run the powerful Segment Anything 2.1 image segmentation model on RK3588!

- Inference Speed (RK3588):
  - Encoder(Tiny)(Single NPU Core): 3s
  - Encoder(Small)(Single NPU Core): 3.5s
  - Encoder(Large)(Single NPU Core): 12s
  - Decoder(CPU): 0.1s

- Memory Usage (RK3588):
  - Encoder(Tiny): 0.95GB
  - Encoder(Small): 1.1GB
  - Encoder(Large): 4.1GB
  - Decoder: Negligible
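
As a rough reading of the timings above: in the usual SAM workflow the image encoder runs once per image and the decoder runs once per prompt, so the encoder dominates the cost. The sketch below only does that arithmetic with the figures listed above. It assumes the image embedding is reused across prompts, and the names `ENCODER_SECONDS` and `estimate_seconds` are illustrative, not part of this repository.

```python
# Approximate per-stage latency on RK3588, in seconds, copied from the list above.
ENCODER_SECONDS = {"tiny": 3.0, "small": 3.5, "large": 12.0}
DECODER_SECONDS = 0.1


def estimate_seconds(variant: str, num_prompts: int = 1) -> float:
    """Estimate total time for one image: encoder once, decoder once per prompt."""
    if num_prompts < 0:
        raise ValueError("num_prompts must be non-negative")
    return ENCODER_SECONDS[variant] + DECODER_SECONDS * num_prompts


if __name__ == "__main__":
    for name in ENCODER_SECONDS:
        total = estimate_seconds(name, num_prompts=5)
        print(f"{name}: about {total:.1f} s for one image with 5 prompts")
```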

## Usage

1. Clone or download this repository. The models are large, so make sure you have enough disk space.

2. Install dependencies

```bash
pip install "numpy<2" pillow matplotlib opencv-python onnxruntime rknn-toolkit-lite2
```

3. Run

```bash
python test_rknn.py
```
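
If `python test_rknn.py` stops with an import error, a quick check like the following can show which runtime dependency is missing. It is a minimal sketch: the list reflects the packages from the install step, using their import names (`PIL` for pillow, `cv2` for opencv-python, `rknnlite` for rknn-toolkit-lite2), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names for the packages installed in step 2. Several differ from
# the pip package names (pillow -> PIL, opencv-python -> cv2,
# rknn-toolkit-lite2 -> rknnlite).
REQUIRED_MODULES = ["numpy", "PIL", "matplotlib", "cv2", "onnxruntime", "rknnlite"]


def missing_modules(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All runtime dependencies are importable.")
```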

You can modify this part of `test_rknn.py`
```python
def main():
    # 1. Load original image
    path = "dog.jpg"
    orig_image, input_image, (scale, offset_x, offset_y) = load_image(path)
    decoder_path = "sam2.1_hiera_small_decoder.onnx"
    encoder_path = "sam2.1_hiera_small_encoder.rknn"
    ...
```

to test different models and images. Note that, unlike SAM1, the encoder and decoder must come from the same model variant (for example, both from `sam2.1_hiera_small`).
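
One way to keep the two files in sync is to derive both paths from a single variant name. This is a minimal sketch that assumes the file naming shown above (`sam2.1_hiera_<variant>_encoder.rknn` and `sam2.1_hiera_<variant>_decoder.onnx`) and covers only the variants listed in this README. `model_paths` is a hypothetical helper, not a function in `test_rknn.py`.

```python
# Variants listed in this README; extend the tuple if you convert others.
VARIANTS = ("tiny", "small", "large")


def model_paths(variant: str) -> tuple[str, str]:
    """Return (encoder_path, decoder_path) for a single SAM 2.1 variant."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}, expected one of {VARIANTS}")
    prefix = f"sam2.1_hiera_{variant}"
    return f"{prefix}_encoder.rknn", f"{prefix}_decoder.onnx"


if __name__ == "__main__":
    encoder_path, decoder_path = model_paths("small")
    print(encoder_path)
    print(decoder_path)
```

Because both paths come from the same `variant` argument, a mismatched encoder and decoder cannot be selected by accident.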

## Model Conversion

1. Install dependencies

```bash
pip install "numpy<2" onnxslim onnxruntime rknn-toolkit2 sam2
```

2. Download the SAM 2.1 `.pt` checkpoint files. They are linked from the [model description section of the SAM 2 repository](https://github.com/facebookresearch/sam2?tab=readme-ov-file#model-description).

3. Convert the `.pt` checkpoint to ONNX models. Using the Tiny model as an example:

```bash
python ./export_onnx.py --model_type sam2.1_hiera_tiny --checkpoint ./sam2.1_hiera_tiny.pt --output_encoder ./sam2.1_hiera_tiny_encoder.onnx --output_decoder sam2.1_hiera_tiny_decoder.onnx
```

4. Convert the ONNX models to RKNN models. Using the Tiny model as an example:

```bash
python ./convert_rknn.py sam2.1_hiera_tiny
```
If you encounter errors during constant folding, try updating onnxruntime to the latest version.
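
For orientation, an ONNX-to-RKNN conversion with RKNN-Toolkit2 generally follows a configure, load, build, export sequence. The sketch below shows that sequence for the encoder only. It is not the contents of `convert_rknn.py`: the file naming, the choice to skip quantization, and the helpers `encoder_paths` and `convert_encoder` are assumptions made for illustration.

```python
import sys


def encoder_paths(model_name: str) -> tuple[str, str]:
    """Map a name such as 'sam2.1_hiera_tiny' to (onnx_path, rknn_path)."""
    return f"{model_name}_encoder.onnx", f"{model_name}_encoder.rknn"


def convert_encoder(model_name: str, target: str = "rk3588") -> str:
    """Convert the encoder ONNX file to RKNN and return the output path."""
    # Imported lazily so the path helper works without the toolkit installed.
    from rknn.api import RKNN

    onnx_path, rknn_path = encoder_paths(model_name)
    rknn = RKNN(verbose=False)
    try:
        rknn.config(target_platform=target)
        # RKNN-Toolkit2 calls return 0 on success.
        if rknn.load_onnx(model=onnx_path) != 0:
            raise RuntimeError(f"failed to load {onnx_path}")
        # Assumption: keep the model in floating point, no calibration dataset.
        if rknn.build(do_quantization=False) != 0:
            raise RuntimeError("RKNN build failed")
        if rknn.export_rknn(rknn_path) != 0:
            raise RuntimeError(f"failed to export {rknn_path}")
    finally:
        rknn.release()
    return rknn_path


if __name__ == "__main__" and len(sys.argv) > 1:
    print(convert_encoder(sys.argv[1]))
```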

## Known Issues

- Only image segmentation is implemented; video segmentation is not supported.
- Because of an issue in RKNN-Toolkit2, converting the decoder model fails. For now, the decoder runs on the CPU with onnxruntime, which slightly increases CPU usage.
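
Running the decoder on the CPU amounts to creating an onnxruntime session restricted to the CPU execution provider. A minimal sketch follows. The decoder's input and output names are not documented here, so the code lists them from the model file instead of assuming them; `cpu_providers` and `describe_decoder` are hypothetical helpers, not part of `test_rknn.py`.

```python
import sys


def cpu_providers() -> list[str]:
    """Execution providers for a CPU-only onnxruntime session."""
    return ["CPUExecutionProvider"]


def describe_decoder(decoder_path: str) -> dict:
    """Open the decoder on the CPU and report its input and output tensors."""
    # Imported lazily so cpu_providers() works without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(decoder_path, providers=cpu_providers())
    return {
        "inputs": [(i.name, i.shape, i.type) for i in session.get_inputs()],
        "outputs": [(o.name, o.shape, o.type) for o in session.get_outputs()],
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    info = describe_decoder(sys.argv[1])
    for kind, tensors in info.items():
        for name, shape, dtype in tensors:
            print(f"{kind}: {name} {shape} {dtype}")
```

Run it with the decoder path as the only argument, for example `sam2.1_hiera_small_decoder.onnx`, to see which tensors the decoder expects before wiring it to the encoder outputs.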

## References

- [samexporter/export_sam21_cvat.py](https://github.com/hashJoe/samexporter/blob/cvat/samexporter/export_sam21_cvat.py)
- [SAM 2](https://github.com/facebookresearch/sam2)