# Model Zoo

## Common Settings

- All COCO models were trained on `coco_2017_train` and evaluated on `coco_2017_val`.
- All models were trained using distributed training.
- Most models were trained with the 50-epoch setting (about 51 COCO epochs) and a multi-step LR scheduler, which is the common setting in DETR-like methods.

## COCO Object Detection Baselines

Here we provide our pretrained baselines with **detrex**. More pretrained weights will be released in future versions. We also provide converted pretrained weights, which are marked as `(converted)`.

### DETR

| Name | Backbone | Pretrained | Epochs | box AP | Download |
| --- | --- | --- | --- | --- | --- |
| DETR-R50 (converted) | R-50 | IN1k | 500 | 42.0 | model |
| DETR-R50-DC5 (converted) | R-50 | IN1k | 500 | 43.4 | model |
| DETR-R101 (converted) | R-101 | IN1k | 500 | 43.5 | model |
| DETR-R101-DC5 (converted) | R-101 | IN1k | 500 | 44.9 | model |

### Deformable-DETR

| Name | Backbone | Pretrained | Epochs | box AP | Download |
| --- | --- | --- | --- | --- | --- |
| Deformable-DETR + Box Refinement | R50 | IN1k | 50 | 47.0 | model |
| Deformable-DETR + Box Refinement + Two Stage | R50 | IN1k | 50 | 48.2 | model |

### Anchor-DETR

| Name | Backbone | Pretrained | Epochs | box AP | Download |
| --- | --- | --- | --- | --- | --- |
| Anchor-DETR-R50 | R-50 | IN1k | 50 | 41.9 | model |
| Anchor-DETR-R50 (converted) | R-50 | IN1k | 50 | 42.2 | model |
| Anchor-DETR-R50-DC5 (converted) | R-50 | IN1k | 50 | 44.2 | model |
| Anchor-DETR-R101 (converted) | R-101 | IN1k | 50 | 43.5 | model |
| Anchor-DETR-R101-DC5 (converted) | R-101 | IN1k | 50 | 45.1 | model |

### Conditional-DETR

| Name | Backbone | Pretrained | Epochs | box AP | Download |
| --- | --- | --- | --- | --- | --- |
| Conditional-DETR-R50 | R-50 | IN1k | 50 | 41.6 | model |
| Conditional-DETR-R50-DC5 (converted) | R-50-DC5 | IN1k | 50 | 43.8 | model |
| Conditional-DETR-R101 (converted) | R-101 | IN1k | 50 | 43.0 | model |
| Conditional-DETR-R101-DC5 (converted) | R-101-DC5 | IN1k | 50 | 45.1 | model |

### DAB-DETR

| Name | Backbone | Pretrained | Epochs | box AP | Download |
| --- | --- | --- | --- | --- | --- |
| DAB-DETR-R50 | R50 | IN1k | 50 | 43.3 | model |
| DAB-DETR-R50-3patterns (converted) | R-50 | IN1k | 50 | 42.8 | model |
| DAB-DETR-R50-DC5 (converted) | R-50 | IN1k | 50 | 44.6 | model |
| DAB-DETR-R50-DC5-3patterns (converted) | R-50 | IN1k | 50 | 45.7 | model |
| DAB-DETR-R101 | R101 | IN1k | 50 | 44.0 | model |
| DAB-DETR-R101-DC5 (converted) | R-101 | IN1k | 50 | 45.7 | model |
| DAB-DETR-Swin-T | Swin-Tiny-224 | IN1k | 50 | 45.2 | model |
| DAB-Deformable-DETR-R50 | R50 | IN1k | 50 | 49.0 | model |
| DAB-Deformable-DETR-R50-Two-Stage | R50 | IN1k | 50 | 49.7 | model |

### DN-DETR

| Name | Backbone | Pretrained | Epochs | box AP | Download |
| --- | --- | --- | --- | --- | --- |
| DN-DETR-R50 | R50 | IN1k | 50 | 44.7 | model |
| DN-DETR-R50-DC5 (converted) | R50 | IN1k | 50 | 46.3 | model |

### DINO

**Pretrained DINO with ResNet Backbone**

| Name | Backbone | Pretrained | Epochs | Denoising Queries | box AP | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DINO-R50-4scale | R50 | IN1k | 12 | 100 | 49.2 | model |
| DINO-R50-4scale (hacked trainer) | R-50 | IN1k | 12 | 100 | 49.4 | model |
| DINO-R50-4scale with EMA | R-50 | IN1k | 12 | 100 | 49.4 | model |
| DINO-R50-5scale | R50 | IN1k | 12 | 100 | 49.6 | model |
| DINO-R50-4scale | R50 | IN1k | 12 | 300 | 49.5 | model |
| DINO-R50-4scale | R50 | IN1k | 24 | 100 | 50.6 | model |
| DINO-R101-4scale | R101 | IN1k | 12 | 100 | 50.0 | model |

**Pretrained DINO with Swin-Transformer Backbone**

| Name | Backbone | Pretrained | Epochs | Denoising Queries | box AP | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DINO-Swin-T-224-4scale | Swin-Tiny-224 | IN1k | 12 | 100 | 51.3 | model |
| DINO-Swin-T-224-4scale | Swin-Tiny-224 | IN22k to IN1k | 12 | 100 | 52.5 | model |
| DINO-Swin-S-224-4scale | Swin-Small-224 | IN1k | 12 | 100 | 53.0 | model |
| DINO-Swin-B-384-4scale | Swin-Base-384 | IN22k to IN1k | 12 | 100 | 55.8 | model |
| DINO-Swin-L-224-4scale | Swin-Large-224 | IN22k to IN1k | 12 | 100 | 56.9 | model |
| DINO-Swin-L-384-4scale | Swin-Large-384 | IN22k to IN1k | 12 | 100 | 56.9 | model |
| DINO-Swin-L-384-5scale | Swin-Large-384 | IN22k to IN1k | 12 | 100 | 57.5 | model |
| DINO-Swin-L-384-4scale | Swin-Large-384 | IN22k to IN1k | 36 | 100 | 58.1 | model |
| DINO-Swin-L-384-5scale | Swin-Large-384 | IN22k to IN1k | 36 | 100 | 58.5 | model |

**Pretrained DINO with FocalNet Backbone**

| Name | Backbone | Pretrained | Epochs | Denoising Queries | box AP | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DINO-FocalNet-Large-4scale | FocalNet-384-LRF-3Level | IN22k | 12 | 100 | 57.5 | model |
| DINO-FocalNet-Large-4scale | FocalNet-384-LRF-4Level | IN22k | 12 | 100 | 58.0 | model |
| DINO-FocalNet-Large-5scale | FocalNet-384-LRF-4Level | IN22k | 12 | 100 | 58.5 | model |

**Pretrained DINO with ViTDet Backbone**

| Name | Backbone | Pretrained | Epochs | Denoising Queries | box AP | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DINO-ViTDet-Base-4scale | ViT | IN1k, MAE | 12 | 100 | 50.2 | model |
| DINO-ViTDet-Base-4scale | ViT | IN1k, MAE | 50 | 100 | 55.0 | model |
| DINO-ViTDet-Large-4scale | ViT | IN1k, MAE | 12 | 100 | 52.9 | model |
| DINO-ViTDet-Large-4scale | ViT | IN1k, MAE | 50 | 100 | 57.5 | model |

### H-Deformable-DETR

| Name | Backbone | Pretrained | Query | Epochs | box AP | Download |
| --- | --- | --- | --- | --- | --- | --- |
| H-Deformable-DETR-R50 + tricks (detrex) | R50 | IN1k | 300 | 12 | 49.1 | model |
| H-Deformable-DETR-R50 + tricks (converted) | R50 | IN1k | 300 | 12 | 48.9 | model |
| H-Deformable-DETR-R50 + tricks (converted) | R50 | IN1k | 300 | 36 | 50.3 | model |
| H-Deformable-DETR-Swin-T + tricks (converted) | Swin-Tiny | IN1k | 300 | 12 | 50.6 | model |
| H-Deformable-DETR-Swin-T + tricks (converted) | Swin-Tiny | IN1k | 300 | 36 | 53.5 | model |
| H-Deformable-DETR-Swin-L + tricks (converted) | Swin-Large | IN22k | 300 | 12 | 56.2 | model |
| H-Deformable-DETR-Swin-L + tricks (converted) | Swin-Large | IN22k | 300 | 36 | 57.5 | model |
| H-Deformable-DETR-Swin-L + tricks (converted) | Swin-Large | IN22k | 900 | 12 | 56.4 | model |
| H-Deformable-DETR-Swin-L + tricks (converted) | Swin-Large | IN22k | 300 | 36 | 57.5 | model |

### DETA

| Name | Backbone | Pretrained | Epochs | box AP | Download |
| --- | --- | --- | --- | --- | --- |
| Improved-Deformable-DETR-R50 (converted) | R-50 | IN1k | 50 | 49.8 | model |
| DETA-R50-5scale (bs=8, 180000 iterations) | R-50 | IN1k | 12 | 50.0 | model |
| DETA-R50-5scale (with hacked train engine) | R-50 | IN1k | 12 | 49.9 | model |
| DETA-R50-5scale-12ep (no frozen backbone) | R-50 | IN1k | 12 | 50.2 | model |
| DETA-R50-5scale (converted) | R-50 | IN1k | 12 | 50.1 | model |
| DETA-Swin-Large-finetune (converted) | Swin-Large-384 | Objects365 | 24 | 62.9 | model |
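
The epoch counts in these tables are nominal. As a rough sanity check, the sketch below converts an iteration budget into approximate COCO epochs. It assumes `coco_2017_train` contains 118,287 images and that the 50-epoch setting corresponds to 375,000 iterations at a total batch size of 16. Those schedule numbers and the helper name `iterations_to_epochs` are assumptions for illustration; they are not stated on this page.

```python
# Approximate number of images in COCO train2017 (assumption for this sketch).
COCO_2017_TRAIN_IMAGES = 118_287


def iterations_to_epochs(
    iterations: int,
    batch_size: int,
    dataset_size: int = COCO_2017_TRAIN_IMAGES,
) -> float:
    """Return how many passes over the dataset an iteration budget covers."""
    if dataset_size <= 0:
        raise ValueError("dataset_size must be positive")
    # Each iteration consumes `batch_size` images in total across all GPUs.
    return iterations * batch_size / dataset_size


if __name__ == "__main__":
    # DETA-R50-5scale row: bs=8, 180000 iterations, listed as 12 epochs.
    print(f"DETA schedule: {iterations_to_epochs(180_000, 8):.2f} epochs")
    # Assumed 50-epoch setting: 375000 iterations at total batch size 16.
    print(f"50-epoch setting: {iterations_to_epochs(375_000, 16):.2f} epochs")
```

Under these assumptions the DETA schedule works out to a little over 12 epochs, and the 50-epoch setting to roughly 51, which is consistent with the "about 51 COCO epochs" note in Common Settings.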