---
license: apache-2.0
pipeline_tag: any-to-any
---
This repository contains the models of the paper [Generalized Decoding for Pixel, Image, and Language](https://huggingface.co/papers/2212.11270).
Github: https://github.com/microsoft/X-Decoder
***Click to Download!***
## -> Models
*Focal-T:*
[xdecoder_focalt_last_novg.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focalt_last_novg.pt)
[xdecoder_focalt_last.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focalt_last.pt)
[xdecoder_focalt_best_openseg.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focalt_best_openseg.pt)
*Focal-L:*
[xdecoder_focall_last.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focall_last.pt)
[xdecoder_focall_bestseg.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focall_bestseg.pt)
## -> Datasets
[caption_class_similarity.pth](https://huggingface.co/xdecoder/X-Decoder/resolve/main/caption_class_similarity.pth)
[captions_train2017_filtrefgumdval_filtvlp.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/captions_train2017_filtrefgumdval_filtvlp.json)
[grounding_train2017_filtrefgumdval_filtvlp.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/grounding_train2017_filtrefgumdval_filtvlp.json)
[panoptic_train2017_filtrefgumdval_filtvlp.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/panoptic_train2017_filtrefgumdval_filtvlp.json)
[refcocog_umd_val.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/refcocog_umd_val.json)
## -> Evaluations
[coco_caption.zip](https://huggingface.co/xdecoder/X-Decoder/resolve/main/coco_caption.zip)