# Segment Anything 2.1 RKNN2

Run the Segment Anything 2.1 image segmentation model on the RK3588, with the image encoder on the NPU and the mask decoder on the CPU.

- Inference speed (RK3588):
  - Encoder (Tiny, single NPU core): 3 s
  - Encoder (Small, single NPU core): 3.5 s
  - Encoder (Large, single NPU core): 12 s
  - Decoder (CPU): 0.1 s

- Memory usage (RK3588):
  - Encoder (Tiny): 0.95 GB
  - Encoder (Small): 1.1 GB
  - Encoder (Large): 4.1 GB
  - Decoder: negligible

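As a rough illustration of what these timings imply, the sketch below estimates the total time for one image with several prompts. It assumes the encoder runs once per image and the decoder once per prompt, which is the usual way SAM-style models are driven but is not stated in this README. The names `ENCODER_SECONDS`, `DECODER_SECONDS`, and `estimate_seconds` are hypothetical.

```python
# Timings copied from the list above (seconds, RK3588, single NPU core).
ENCODER_SECONDS = {"tiny": 3.0, "small": 3.5, "large": 12.0}
DECODER_SECONDS = 0.1


def estimate_seconds(variant, num_prompts=1):
    """Estimate the time to segment one image.

    Assumption: one encoder pass per image plus one decoder pass per prompt.
    """
    if num_prompts < 0:
        raise ValueError("num_prompts must be non-negative")
    return ENCODER_SECONDS[variant] + num_prompts * DECODER_SECONDS


if __name__ == "__main__":
    for variant in ENCODER_SECONDS:
        print(variant, round(estimate_seconds(variant, num_prompts=5), 2))
```

Under that assumption the encoder dominates the total, so extra prompts on the same image add little time.
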
## Usage

1. Clone or download this repository. The model files are large, so make sure you have enough free disk space.

2. Install the dependencies:

```bash
pip install "numpy<2" pillow matplotlib opencv-python onnxruntime rknn-toolkit-lite2
```

The quotes around `numpy<2` stop the shell from treating `<2` as an input redirection.

3. Run the demo:

```bash
python test_rknn.py
```

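If the demo fails at import time, a quick check like the one below can show which packages from step 2 are missing. This is a minimal sketch: the import names in `REQUIRED_MODULES` are assumed from the pip package names (for example `cv2` for `opencv-python` and `rknnlite` for `rknn-toolkit-lite2`), and `missing_modules` is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Import names assumed to correspond to the packages installed in step 2.
REQUIRED_MODULES = ["numpy", "PIL", "matplotlib", "cv2", "onnxruntime", "rknnlite"]


def missing_modules(names):
    """Return the names that cannot be located by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```
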
To try other models or images, edit this part of `test_rknn.py`:

```python
def main():
    # 1. Load original image
    path = "dog.jpg"
    orig_image, input_image, (scale, offset_x, offset_y) = load_image(path)
    decoder_path = "sam2.1_hiera_small_decoder.onnx"
    encoder_path = "sam2.1_hiera_small_encoder.rknn"
    ...
```

Note that, unlike SAM1, the encoder and decoder must come from the same model variant (for example, both `sam2.1_hiera_small`).

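One way to keep the encoder and decoder in step is to derive both file names from a single variant string. The sketch below assumes the `<variant>_encoder.rknn` / `<variant>_decoder.onnx` naming seen in the snippet above; `model_paths` and its `model_dir` parameter are hypothetical and not part of `test_rknn.py`.

```python
import os

ENCODER_SUFFIX = "_encoder.rknn"
DECODER_SUFFIX = "_decoder.onnx"


def model_paths(variant, model_dir=""):
    """Return (encoder_path, decoder_path) for one variant.

    Building both paths from the same variant string makes a mismatched
    encoder/decoder pair impossible by construction.
    """
    encoder_path = os.path.join(model_dir, variant + ENCODER_SUFFIX)
    decoder_path = os.path.join(model_dir, variant + DECODER_SUFFIX)
    return encoder_path, decoder_path


if __name__ == "__main__":
    encoder_path, decoder_path = model_paths("sam2.1_hiera_small")
    print(encoder_path, decoder_path)
```
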
## Model Conversion

1. Install the dependencies:

```bash
pip install "numpy<2" onnxslim onnxruntime rknn-toolkit2 sam2
```

2. Download the SAM2.1 `.pt` checkpoint files from [here](https://github.com/facebookresearch/sam2?tab=readme-ov-file#model-description).

3. Convert the `.pt` checkpoint to ONNX models. Using the Tiny model as an example:

```bash
python ./export_onnx.py --model_type sam2.1_hiera_tiny --checkpoint ./sam2.1_hiera_tiny.pt --output_encoder ./sam2.1_hiera_tiny_encoder.onnx --output_decoder sam2.1_hiera_tiny_decoder.onnx
```

4. Convert the ONNX models to RKNN models. Using the Tiny model as an example:

```bash
python ./convert_rknn.py sam2.1_hiera_tiny
```

If constant folding fails during this step, try updating onnxruntime to the latest version.

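To repeat steps 3 and 4 for another variant, the two commands can be generated from the variant name. The sketch below only builds the argument lists and mirrors the Tiny example above. That other variants (such as `sam2.1_hiera_small`) follow the same naming for `--model_type` and the checkpoint file is an assumption, and `conversion_commands` is a hypothetical helper.

```python
import shlex
import sys


def conversion_commands(variant, python=sys.executable):
    """Build the export and convert commands for one variant.

    Assumption: every variant follows the naming of the Tiny example.
    """
    export_cmd = [
        python, "./export_onnx.py",
        "--model_type", variant,
        "--checkpoint", f"./{variant}.pt",
        "--output_encoder", f"./{variant}_encoder.onnx",
        "--output_decoder", f"{variant}_decoder.onnx",
    ]
    convert_cmd = [python, "./convert_rknn.py", variant]
    return [export_cmd, convert_cmd]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in conversion_commands("sam2.1_hiera_small"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right.
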
## Known Issues

- Only image segmentation is implemented; video segmentation is not supported.
- Because of an issue in RKNN-Toolkit2, converting the decoder model fails. For now the decoder runs on the CPU with onnxruntime, which slightly increases CPU usage.

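For reference, running the decoder on the CPU with onnxruntime might look like the sketch below. It is not taken from `test_rknn.py`: `load_decoder_session` and `describe_inputs` are hypothetical helpers, and the decoder's input names and shapes are read from the model rather than assumed.

```python
import os


def load_decoder_session(decoder_path):
    """Open the ONNX decoder on the CPU execution provider."""
    if not os.path.isfile(decoder_path):
        raise FileNotFoundError(decoder_path)
    # Imported here so the path check above works even without onnxruntime.
    import onnxruntime as ort
    return ort.InferenceSession(decoder_path, providers=["CPUExecutionProvider"])


def describe_inputs(session):
    """List (name, shape) for each model input, as reported by onnxruntime."""
    return [(inp.name, inp.shape) for inp in session.get_inputs()]


if __name__ == "__main__":
    session = load_decoder_session("sam2.1_hiera_small_decoder.onnx")
    for name, shape in describe_inputs(session):
        print(name, shape)
```
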
## References

- [samexporter/export_sam21_cvat.py](https://github.com/hashJoe/samexporter/blob/cvat/samexporter/export_sam21_cvat.py)
- [SAM 2](https://github.com/facebookresearch/sam2)