---
license: mit
datasets:
- conceptual_captions
- sbu_captions
- visual_genome
language:
- en
tags:
- BridgeTower
---
Model weights for the AAAI 2023 oral paper [BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning](https://arxiv.org/abs/2206.08657).

Additional materials: [Code](https://github.com/microsoft/BridgeTower), [Slides](https://looperxx.github.io/files/BridgeTower-AAAI23-PPT-2023-02-08.pdf), [Video(EN)](https://youtu.be/VoHS6RB9LIg), [Video(CN)](https://www.bilibili.com/video/BV1sT411d7Cr), [Blog(CN)](http://looperxx.github.io/blog/BridgeTower), [Tweet(EN)](https://twitter.com/looperxx27/status/1621862912422993921).

BridgeTower has also been integrated into [Transformers](https://github.com/huggingface/transformers/).
- [Model Hub](https://huggingface.co/BridgeTower), [Code](https://github.com/huggingface/transformers/tree/main/src/transformers/models/bridgetower) and [Documentation](https://huggingface.co/docs/transformers/main/en/model_doc/bridgetower) are available.
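
As a rough illustration of the Transformers integration, the sketch below loads a BridgeTower checkpoint and encodes one image-text pair. It assumes `transformers`, `torch`, and `Pillow` are installed. The checkpoint id `BridgeTower/bridgetower-base` is an assumption taken from the Model Hub organization linked above and may not correspond to the weights hosted in this repository. The helper name `encode_pair` is hypothetical.

```python
# Assumption: illustrative checkpoint id from the BridgeTower Model Hub
# organization; replace it with the checkpoint you actually want to use.
CHECKPOINT = "BridgeTower/bridgetower-base"


def encode_pair(image, text, checkpoint=CHECKPOINT):
    """Encode one (PIL image, text) pair and return the pooled output.

    Hypothetical helper, not part of the BridgeTower code base.
    """
    # Imported lazily so that defining the helper does not require the
    # heavy dependencies to be installed.
    import torch
    from transformers import BridgeTowerModel, BridgeTowerProcessor

    # The processor bundles the image preprocessing and the text tokenizer.
    processor = BridgeTowerProcessor.from_pretrained(checkpoint)
    model = BridgeTowerModel.from_pretrained(checkpoint)
    model.eval()

    inputs = processor(image, text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # The model output also exposes `text_features` and `image_features`.
    return outputs.pooler_output
```

A call such as `encode_pair(Image.open("photo.jpg"), "a cat on a sofa")`, where `photo.jpg` is a placeholder path, downloads the checkpoint on first use. For image-text matching or masked language modeling, the linked documentation describes task-specific classes that may be a better fit than the bare encoder.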