flaviagiammarino committed on
Commit
0199e15
1 Parent(s): 699e237

Update README.md

Files changed (1)
  1. README.md +10 -8
README.md CHANGED
@@ -17,15 +17,16 @@ PubMedCLIP is a fine-tuned version of [CLIP](https://huggingface.co/docs/transfo
## Model Description
PubMedCLIP was trained on the [Radiology Objects in COntext (ROCO)](https://github.com/razorx89/roco-dataset) dataset, a large-scale multimodal medical imaging dataset.
The ROCO dataset includes diverse imaging modalities (such as ultrasound, X-Ray, MRI, etc.) from various human body regions (such as head, neck, spine, etc.)
- captured from open-access [PubMed](https://pubmed.ncbi.nlm.nih.gov/) articles. The authors of PubMedCLIP have released three different pre-trained models at
+ captured from open-access [PubMed](https://pubmed.ncbi.nlm.nih.gov/) articles.<br>
+
+ The authors of PubMedCLIP have released three different pre-trained models at
this [link](https://1drv.ms/u/s!ApXgPqe9kykTgwD4Np3-f7ODAot8?e=zLVlJ2) which use ResNet-50, ResNet-50x4 and ViT32 as image encoders.
- This repository includes only the ViT32 variant of the PubMedCLIP model.
+ This repository includes only the ViT32 variant of the PubMedCLIP model.<br>

- **Repository:** [PubMedCLIP Official GitHub Repository](https://github.com/sarahESL/PubMedCLIP)
- **Paper:** [Does CLIP Benefit Visual Question Answering in the Medical Domain as Much as it Does in the General Domain?](https://arxiv.org/abs/2112.13906)
- - **Dataset:** [Radiology Objects in COntext (ROCO)](https://github.com/razorx89/roco-dataset)
-
- ## Use
+
+ ## Use with Transformers

```python
import requests
@@ -38,10 +39,11 @@ processor = CLIPProcessor.from_pretrained("flaviagiammarino/pubmed-clip-vit-base

url = "https://d168r5mdg5gtkq.cloudfront.net/medpix/img/full/synpic9078.jpg"
image = Image.open(requests.get(url, stream=True).raw)
- text = ["Chest X-Ray", "Brain MRI", "Abdominal CT Scan"]

- inputs = processor(text=text, images=image, return_tensors="pt", padding=True)
- probs = model(**inputs).logits_per_image.softmax(dim=1)
+ inputs = processor(text=["Chest X-Ray", "Brain MRI", "Abdominal CT Scan"], images=image, return_tensors="pt", padding=True)
+ outputs = model(**inputs)
+ logits_per_image = outputs.logits_per_image # this is the image-text similarity score
+ probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
```

## Additional Information
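
For readers who want to try the updated snippet end to end, here is a minimal sketch that also maps the probabilities back to the candidate labels. The full repository id (`flaviagiammarino/pubmed-clip-vit-base-patch32`) and the `CLIPModel` loading step are assumptions, since the hunks above only show them truncated or not at all; the label read-out at the end is an added illustration, not part of the commit.

```python
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Assumed full repository id; the hunk header above shows it truncated.
model_id = "flaviagiammarino/pubmed-clip-vit-base-patch32"
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

url = "https://d168r5mdg5gtkq.cloudfront.net/medpix/img/full/synpic9078.jpg"
image = Image.open(requests.get(url, stream=True).raw)
labels = ["Chest X-Ray", "Brain MRI", "Abdominal CT Scan"]

# Same preprocessing and forward pass as the updated snippet in the diff.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=1)

# Pair each candidate label with its probability and report the best match.
for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.2%}")
print("Predicted label:", labels[probs[0].argmax().item()])
```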