# Segment Anything 2.1 RKNN2

Run the powerful Segment Anything 2.1 image segmentation model on RK3588!

- Inference Speed (RK3588):
  - Encoder(Tiny)(Single NPU Core): 3s
  - Encoder(Small)(Single NPU Core): 3.5s
  - Encoder(Large)(Single NPU Core): 12s
  - Decoder(CPU): 0.1s
- Memory Usage (RK3588):
  - Encoder(Tiny): 0.95GB
  - Encoder(Small): 1.1GB
  - Encoder(Large): 4.1GB
  - Decoder: Negligible

## Usage

1. Clone or download this repository. The models are large, so make sure you have enough disk space.

2. Install dependencies (quote `numpy<2` so the shell does not treat `<` as input redirection):

```bash
pip install "numpy<2" pillow matplotlib opencv-python onnxruntime rknn-toolkit-lite2
```

3. Run

```bash
python test_rknn.py
```

You can modify this part of `test_rknn.py`

```python
def main():
    # 1. Load original image
    path = "dog.jpg"
    orig_image, input_image, (scale, offset_x, offset_y) = load_image(path)
    decoder_path = "sam2.1_hiera_small_decoder.onnx"
    encoder_path = "sam2.1_hiera_small_encoder.rknn"
    ...
```

to test different models and images. Note that, unlike SAM1, the encoder and decoder must come from the same model variant (for example, both Small).

## Model Conversion

1. Install dependencies

```bash
pip install "numpy<2" onnxslim onnxruntime rknn-toolkit2 sam2
```

2. Download the SAM2.1 pt model files. You can download them from [here](https://github.com/facebookresearch/sam2?tab=readme-ov-file#model-description).

3. Convert the pt model to onnx models. Taking the Tiny model as an example:

```bash
python ./export_onnx.py --model_type sam2.1_hiera_tiny --checkpoint ./sam2.1_hiera_tiny.pt --output_encoder ./sam2.1_hiera_tiny_encoder.onnx --output_decoder sam2.1_hiera_tiny_decoder.onnx
```

4. Convert the onnx model to an rknn model. Taking the Tiny model as an example:

```bash
python ./convert_rknn.py sam2.1_hiera_tiny
```

If you encounter errors during constant folding, try updating onnxruntime to the latest version.

## Known Issues

- Only image segmentation is implemented; video segmentation is not supported.
- Converting the decoder model with RKNN-Toolkit2 currently fails, so for now the decoder runs on the CPU through onnxruntime, which slightly increases CPU usage.
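
Because the decoder stays in ONNX format while the encoder is converted to RKNN, a run needs two files with different extensions that still belong to the same variant. The following is a minimal sketch of a helper that derives both paths from one variant name, so the two cannot drift apart. The function `model_paths`, the `VARIANTS` tuple, and the `model_dir` parameter are hypothetical names introduced here for illustration and are not part of this repository; the filename pattern is assumed from the examples in this README.

```python
from pathlib import Path
from typing import Tuple

# Variant names as listed in the timing table of this README (assumption:
# other variants, if any, would follow the same filename pattern).
VARIANTS = ("tiny", "small", "large")


def model_paths(variant: str, model_dir: str = ".") -> Tuple[Path, Path]:
    """Return (encoder_path, decoder_path) for a single SAM 2.1 variant.

    Hypothetical helper: the encoder is the .rknn file for the NPU and the
    decoder is the .onnx file for CPU onnxruntime.
    """
    if variant not in VARIANTS:
        raise ValueError(
            "unknown variant %r; expected one of %s" % (variant, ", ".join(VARIANTS))
        )
    root = Path(model_dir)
    encoder = root / ("sam2.1_hiera_%s_encoder.rknn" % variant)
    decoder = root / ("sam2.1_hiera_%s_decoder.onnx" % variant)
    return encoder, decoder


if __name__ == "__main__":
    encoder_path, decoder_path = model_paths("small")
    print(encoder_path, decoder_path)
```

With `"small"`, the helper yields the same two filenames that appear in the `test_rknn.py` snippet, so the resulting strings could be assigned to `encoder_path` and `decoder_path` there.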
## References

- [samexporter/export_sam21_cvat.py](https://github.com/hashJoe/samexporter/blob/cvat/samexporter/export_sam21_cvat.py)
- [SAM 2](https://github.com/facebookresearch/sam2)