---
datasets:
- reasonseg
language: en
license: other
pipeline_tag: image-segmentation
library_name: transformers
tags:
- vision
- segmentation
---
# Seg-Zero-7B
Seg-Zero-7B is the model from the paper [Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement](https://huggingface.co/papers/2503.06520). It uses a decoupled architecture that pairs a reasoning model with a segmentation model, and it is trained via reinforcement learning with GRPO, without explicit reasoning data. This leads to robust zero-shot generalization and emergent test-time reasoning.
Code: https://github.com/dvlab-research/Seg-Zero
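To make the GRPO reference above concrete, here is a minimal sketch of the group-relative advantage at the core of GRPO: several responses are sampled for the same prompt, and each reward is normalized against the mean and standard deviation of its group. The reward values and the function name `group_relative_advantages` are made up for illustration; the reward design actually used for Seg-Zero is in the paper and repository and is not reproduced here.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (GRPO-style)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    # eps avoids division by zero when every response gets the same reward
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four responses sampled from one prompt
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group mean get a positive advantage and are reinforced; those below the mean are discouraged. No separate value model is needed, because the group itself serves as the baseline.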
## Description
Seg-Zero-7B uses a decoupled architecture consisting of a reasoning model and a segmentation model. The reasoning model interprets the user's intent, generates an explicit reasoning chain, and produces positional prompts. The segmentation model then uses these prompts to generate pixel-level masks.
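The hand-off between the two models can be sketched as a small parsing step. The example below assumes, purely for illustration, that the reasoning model wraps its reasoning in `<think>` tags and emits a JSON object with a `bbox` and a list of `points` inside `<answer>` tags. The tag names, the JSON keys, and the function `parse_positional_prompt` are all assumptions, not the confirmed output format; check the inference script in the repository for the real one.

```python
import json
import re


def parse_positional_prompt(response: str):
    """Extract a box and points from an ASSUMED '<answer>{json}</answer>' format.

    Returns None when the response does not match the assumed format.
    """
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    bbox = payload.get("bbox")
    points = payload.get("points", [])
    if not (isinstance(bbox, list) and len(bbox) == 4):
        return None
    return {
        "bbox": [float(v) for v in bbox],  # [x1, y1, x2, y2] in pixels (assumed)
        "points": [[float(x), float(y)] for x, y in points],
    }


# Hypothetical reasoning-model output, written by hand for this example
response = (
    "<think>The odd item is the red kettle on the shelf.</think>"
    '<answer>{"bbox": [10, 20, 110, 220], "points": [[60, 120]]}</answer>'
)
prompt = parse_positional_prompt(response)
```

The resulting box and points are the kind of positional prompt a promptable segmentation model accepts as input. Keeping this interface as plain coordinates is what lets the two models stay decoupled.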
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load the model weights and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("Ricky06662/Seg-Zero-7B")
tokenizer = AutoTokenizer.from_pretrained("Ricky06662/Seg-Zero-7B")
```
## Installation
```bash
git clone https://github.com/dvlab-research/Seg-Zero.git
cd Seg-Zero
conda create -n seg_zero python=3.11
conda activate seg_zero
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1
pip install -e .
pip install sam2
pip install matplotlib
```
## Inference
```bash
python inference_scripts/infer.py
```
The default question is:
> "the unusual object in the image."
The reasoning process is printed to the command line, and the mask is saved in the **inference_scripts** folder. You can also supply your own image and question with `--image_path` and `--text`:
```bash
python inference_scripts/infer.py --image_path "your_image_path" --text "your question text"
```
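To run the same script over several images, a small wrapper can build the command shown above for each image and question pair. This is only a sketch: `build_infer_command` and `run_batch` are hypothetical helpers, it relies solely on the `--image_path` and `--text` flags documented here, and it assumes you run it from the repository root. Whether the script overwrites the saved mask between runs is not covered here, so check the output after each call.

```python
import subprocess
import sys


def build_infer_command(image_path, text, script="inference_scripts/infer.py"):
    """Build the argv list for one run of the inference script."""
    return [sys.executable, script, "--image_path", image_path, "--text", text]


def run_batch(jobs):
    """Run the inference script once per (image_path, text) pair, in order."""
    for image_path, text in jobs:
        # check=True raises CalledProcessError if a run exits with an error
        subprocess.run(build_infer_command(image_path, text), check=True)
```

For example, `run_batch([("your_image_path", "your question text")])` is equivalent to the single command above.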