naotous committed
Commit 0f77922
Parent: a3676a0

Edits from Tristan

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -10,7 +10,7 @@ library_name: open_clip
 
 # BiomedCLIP-PubMedBERT_256-vit_base_patch16_224
 
-[BiomedCLIP](https://aka.ms/biomedclip-paper) is a biomedical vision-language foundation model that is pretrained on [PMC-15M](https://aka.ms/biomedclip-paper) dataset using contrastive learning.
+[BiomedCLIP](https://aka.ms/biomedclip-paper) is a biomedical vision-language foundation model that is pretrained on [PMC-15M](https://aka.ms/biomedclip-paper), a dataset of 15 million figure-caption pairs extracted from biomedical research articles in PubMed Central, using contrastive learning.
 It uses PubMedBERT as the text encoder and Vision Transformer as the image encoder, with domain-specific adaptations.
 It can perform various vision-language processing (VLP) tasks such as cross-modal retrieval, image classification, and visual question answering.
 BiomedCLIP establishes new state of the art in a wide range of standard datasets, and substantially outperforms prior VLP approaches:
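
Since the README's front matter names open_clip as the library, a minimal zero-shot classification sketch for the model described in this diff might look like the following. The hf-hub path is inferred from the model name in the heading, and the candidate labels and image filename are illustrative assumptions, not part of this commit.

```python
import torch
import open_clip
from PIL import Image

# Assumed Hub location, inferred from the README heading; adjust if the
# checkpoint is published under a different namespace.
MODEL = 'hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224'

# create_model_from_pretrained returns the model plus its image transform.
model, preprocess = open_clip.create_model_from_pretrained(MODEL)
tokenizer = open_clip.get_tokenizer(MODEL)
model.eval()

# Zero-shot classification: embed an image and candidate captions into the
# shared space learned contrastively on PMC-15M, then compare by cosine.
labels = ['chest X-ray', 'brain MRI', 'histopathology slide']  # illustrative
image = preprocess(Image.open('example.png')).unsqueeze(0)     # hypothetical file
texts = tokenizer(labels)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(texts)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f'{label}: {p:.3f}')
```

The same normalized embeddings support the cross-modal retrieval task mentioned in the README: rank a pool of images by similarity to a text query, or vice versa.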