amphora committed on
Commit
8b6d3c7
1 Parent(s): 42c971d

chore: added explanation

Files changed (1)
  1. image2text.py +4 -3
image2text.py CHANGED
@@ -13,9 +13,10 @@ def app(model_name):
     st.title("Zero-shot Image Classification")
     st.markdown(
         """
-    This demonstration explores capability of KoCLIP in the field of Zero-Shot Prediction. This demo takes a set of image and captions from, and predicts the most likely label among the different captions given.
-    KoCLIP is a retraining of OpenAI's CLIP model using 82,783 images from MSCOCO dataset and Korean caption annotations. Korean translation of caption annotations were obtained from AI Hub. Base model koclip uses klue/roberta as text encoder and openai/clip-vit-base-patch32 as image encoder. Larger model koclip-large uses klue/roberta as text encoder and bigger google/vit-large-patch16-224 as image encoder.
-    """
+    This demonstration explores the capability of KoCLIP for zero-shot prediction. The demo takes an image and a set of candidate captions, and predicts the most likely caption for the image.
+
+    KoCLIP is a retraining of OpenAI's CLIP model on 82,783 images from the [MSCOCO](https://cocodataset.org/#home) dataset with Korean caption annotations; the Korean translations of the captions were obtained from [AI Hub](https://aihub.or.kr/keti_data_board/visual_intelligence). The base model `koclip` uses `klue/roberta` as its text encoder and `openai/clip-vit-base-patch32` as its image encoder. The larger model `koclip-large` uses `klue/roberta` as its text encoder and the bigger `google/vit-large-patch16-224` as its image encoder.
+    """
     )
 
     query = st.file_uploader("Choose an image...", type=["jpg", "jpeg", "png"])
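The caption ranking described in the added text follows the standard CLIP zero-shot recipe: encode the image and all candidate captions, then softmax the image-text similarity scores. Below is a minimal sketch of that recipe using the `openai/clip-vit-base-patch32` checkpoint named in the description as a stand-in; the demo itself loads the `koclip` checkpoints through its own project code, and the image path and captions here are hypothetical placeholders.

```python
# Minimal sketch of CLIP-style zero-shot caption ranking.
# Assumptions: openai/clip-vit-base-patch32 stands in for the koclip
# checkpoints, which the demo loads via its own project code;
# "example.jpg" and the captions below are placeholders.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # placeholder image
captions = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

# Encode the image and all candidate captions in one batch.
inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax turns
# them into a probability distribution over the candidate captions.
probs = outputs.logits_per_image.softmax(dim=-1)[0]
for caption, p in zip(captions, probs.tolist()):
    print(f"{p:.3f}  {caption}")
```

Since `koclip` pairs `klue/roberta` with the same kind of vision encoder, swapping in a koclip checkpoint and Korean captions should in principle give the demo's behavior; the scoring logic is unchanged.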