Fine-tuning YOLO-World for Instance Segmentation
Models
We fine-tune YOLO-World on LVIS (LVIS-Base) with mask annotations for open-vocabulary (zero-shot) instance segmentation.
We provide two strategies for fine-tuning YOLO-World towards open-vocabulary instance segmentation:
- Fine-tuning all modules: leads to better LVIS segmentation accuracy but affects the zero-shot performance.
- Fine-tuning the segmentation head: maintains the zero-shot performance but lowers LVIS segmentation accuracy.
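The difference between the two strategies comes down to which parameters are allowed to update. The following is a minimal sketch of that selection, not the repository's actual configuration: the `seg_head.` prefix, the example parameter names, and the `trainable_flags` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical prefix for segmentation-head parameters (an assumption;
# real checkpoints may name their modules differently).
SEG_HEAD_PREFIXES: Tuple[str, ...] = ("seg_head.",)


def trainable_flags(param_names: Iterable[str], strategy: str) -> Dict[str, bool]:
    """Map each parameter name to True (trainable) or False (frozen).

    strategy="all"      -> every parameter is trainable.
    strategy="seg_head" -> only parameters under SEG_HEAD_PREFIXES are trainable.
    """
    if strategy not in ("all", "seg_head"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    flags: Dict[str, bool] = {}
    for name in param_names:
        if strategy == "all":
            flags[name] = True
        else:
            # str.startswith accepts a tuple of candidate prefixes.
            flags[name] = name.startswith(SEG_HEAD_PREFIXES)
    return flags


# Hypothetical parameter names, for illustration only.
names = [
    "backbone.conv1.weight",
    "neck.fuse.weight",
    "det_head.cls.weight",
    "seg_head.proto.weight",
]

for name, trainable in trainable_flags(names, "seg_head").items():
    print(f"{name}: {'trainable' if trainable else 'frozen'}")
```

In a PyTorch training script, flags like these would typically be applied by setting `requires_grad` on each entry of `model.named_parameters()`, so that frozen modules keep their pre-trained weights.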
Model | Fine-tuning Data | Fine-tuning Modules | APmask | APr | APc | APf | Weights |
---|---|---|---|---|---|---|---|
YOLO-World-Seg-M | LVIS-Base | all modules | 25.9 | 13.4 | 24.9 | 32.6 | HF Checkpoints 🤗 |
YOLO-World-v2-Seg-M | LVIS-Base | all modules | 25.9 | 13.4 | 24.9 | 32.6 | HF Checkpoints 🤗 |
YOLO-World-Seg-L | LVIS-Base | all modules | 28.7 | 15.0 | 28.3 | 35.2 | HF Checkpoints 🤗 |
YOLO-World-v2-Seg-L | LVIS-Base | all modules | 28.7 | 15.0 | 28.3 | 35.2 | HF Checkpoints 🤗 |
YOLO-World-Seg-M | LVIS-Base | seg head | 16.7 | 12.6 | 14.6 | 20.8 | HF Checkpoints 🤗 |
YOLO-World-v2-Seg-M | LVIS-Base | seg head | 17.8 | 13.9 | 15.5 | 22.0 | HF Checkpoints 🤗 |
YOLO-World-Seg-L | LVIS-Base | seg head | 19.1 | 14.2 | 17.2 | 23.5 | HF Checkpoints 🤗 |
YOLO-World-v2-Seg-L | LVIS-Base | seg head | 19.8 | 17.2 | 17.5 | 23.6 | HF Checkpoints 🤗 |
NOTE:
- The mask AP is evaluated on LVIS val 1.0.
- All models are fine-tuned for 80 epochs on LVIS-Base (866 categories, common + frequent).
- YOLO-World-Seg with only the seg head fine-tuned maintains the original zero-shot detection capability and segments objects.
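To make the accuracy cost of freezing everything but the seg head concrete, the short sketch below computes the mask AP gap between the two strategies for each model. The values are copied from the APmask column of the table; the `MASK_AP` dictionary and `ap_gap` helper are illustrative names, not part of the project.

```python
# Mask AP on LVIS val 1.0, copied from the table:
# model -> (all modules fine-tuned, seg head only fine-tuned)
MASK_AP = {
    "YOLO-World-Seg-M": (25.9, 16.7),
    "YOLO-World-v2-Seg-M": (25.9, 17.8),
    "YOLO-World-Seg-L": (28.7, 19.1),
    "YOLO-World-v2-Seg-L": (28.7, 19.8),
}


def ap_gap(model: str) -> float:
    """Return mask AP (all modules) minus mask AP (seg head only)."""
    all_modules, seg_head = MASK_AP[model]
    # Round to one decimal to match the precision reported in the table.
    return round(all_modules - seg_head, 1)


for model in MASK_AP:
    print(f"{model}: {ap_gap(model)} AP lower with seg-head-only fine-tuning")
```

On these numbers, seg-head-only fine-tuning costs roughly 8 to 10 mask AP points on LVIS in exchange for keeping the original zero-shot detection capability.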