weifeng chen committed on
Commit
56b40e2
1 Parent(s): 0e7276c
Files changed (2)
  1. README.md +7 -5
  2. pytorch_model.bin +1 -1
README.md CHANGED
@@ -2,6 +2,7 @@
  license: apache-2.0
  # inference: false
  # pipeline_tag: zero-shot-image-classification
+ pipeline_tag: feature-extraction
 
  # inference:
  #  parameters:
@@ -9,6 +10,7 @@ tags:
  - clip
  - zh
  - image-text
+ - feature-extraction
  ---
 
  # Model Details
@@ -31,8 +33,8 @@ from transformers import BertForSequenceClassification, BertConfig, BertTokenizer
  import numpy as np
 
  # Load the TaiYi Chinese text encoder
- text_tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/TaiYi-CLIP-Roberta-Chinese")
- text_encoder = BertForSequenceClassification.from_pretrained("IDEA-CCNL/TaiYi-CLIP-Roberta-Chinese").eval()
+ text_tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/TaiYi-CLIP-Roberta-102M-Chinese")
+ text_encoder = BertForSequenceClassification.from_pretrained("IDEA-CCNL/TaiYi-CLIP-Roberta-102M-Chinese").eval()
  text = text_tokenizer(["一只猫", "一只狗", "两只猫", "两只老虎", "一只老虎"], return_tensors='pt', padding=True)['input_ids']
 
  # Load the CLIP image encoder
@@ -59,14 +61,14 @@ with torch.no_grad():
  ### Zero-Shot Classification
  | model | dataset | Top1 | Top5 |
  | ---- | ---- | ---- | ---- |
- | TaiYi-CLIP-ViT-B-32-Roberta-Chinese | ImageNet-CN | 40.64% | 69.16% |
+ | TaiYi-CLIP-ViT-B-32-Roberta-Chinese | ImageNet1k-CN | 41.00% | 69.19% |
 
- ### Text-to-Image Retrieval
+ ### Zero-Shot Text-to-Image Retrieval
 
  | model | dataset | Top1 | Top5 | Top10 |
  | ---- | ---- | ---- | ---- | ---- |
  | TaiYi-CLIP-ViT-B-32-Roberta-Chinese | COCO-CN | 25.47% | 51.70% | 63.07% |
- | TaiYi-CLIP-ViT-B-32-Roberta-Chinese | wukong50k | 47.64% | 80.97% | 89.51% |
+ | TaiYi-CLIP-ViT-B-32-Roberta-Chinese | wukong50k | 48.67% | 81.77% | 90.09% |
 
 
  # Citation
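For context, the README hunk above only shows the text-encoder side of the pipeline. A minimal end-to-end sketch of the zero-shot classification flow it describes might look like the following; the `openai/clip-vit-base-patch32` image encoder, the `CLIPModel`/`CLIPProcessor` calls, the sample COCO image URL, and the use of the classification-head `.logits` as text features are assumptions for illustration, not part of this diff.

```python
import torch
import numpy as np
import requests
from PIL import Image
from transformers import BertForSequenceClassification, BertTokenizer, CLIPModel, CLIPProcessor

# Chinese text encoder from this repo (classification-head logits assumed to act as text features)
text_tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/TaiYi-CLIP-Roberta-102M-Chinese")
text_encoder = BertForSequenceClassification.from_pretrained("IDEA-CCNL/TaiYi-CLIP-Roberta-102M-Chinese").eval()

# Image encoder: an off-the-shelf CLIP ViT-B/32 checkpoint (assumed)
clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["一只猫", "一只狗", "两只猫", "两只老虎", "一只老虎"]  # "a cat", "a dog", "two cats", "two tigers", "a tiger"
text = text_tokenizer(labels, return_tensors="pt", padding=True)["input_ids"]

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # example image of two cats (assumed sample)
image = processor(images=Image.open(requests.get(url, stream=True).raw), return_tensors="pt")

with torch.no_grad():
    image_features = clip_model.get_image_features(**image)
    text_features = text_encoder(text).logits
    # Normalize and score by cosine similarity, as in the original CLIP
    image_features = image_features / image_features.norm(dim=1, keepdim=True)
    text_features = text_features / text_features.norm(dim=1, keepdim=True)
    logits_per_image = 100.0 * image_features @ text_features.t()
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print(np.around(probs, 3))  # ideally the highest probability lands on "两只猫" (two cats)
```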
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2c409d7373abd00263b4a4af3adbf38468636c51d10675b2a4efb7d05a0d5115
+ oid sha256:53ec5505ee1ce25f970c5ce488bbd49b5727c36faa2132de0f2cf82dddbf3e37
  size 410713709
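The Top1/Top5/Top10 figures updated in the README tables above are standard top-k hit rates over a query-candidate similarity matrix. As a generic reference only, not the authors' evaluation script, a sketch of how such metrics are commonly computed:

```python
import torch

def topk_accuracy(similarity: torch.Tensor, targets: torch.Tensor, ks=(1, 5, 10)):
    """Fraction of queries whose correct match appears among the top-k most similar candidates.

    similarity: (num_queries, num_candidates) score matrix, e.g. text-to-image cosine similarities.
    targets:    (num_queries,) index of the correct candidate for each query.
    """
    results = {}
    max_k = max(ks)
    # Indices of the max_k highest-scoring candidates per query
    topk = similarity.topk(max_k, dim=1).indices          # (num_queries, max_k)
    hits = topk.eq(targets.unsqueeze(1))                  # (num_queries, max_k) boolean
    for k in ks:
        results[f"Top{k}"] = hits[:, :k].any(dim=1).float().mean().item()
    return results

# Toy usage: 4 text queries scored against 6 candidate images
sim = torch.randn(4, 6)
gt = torch.tensor([0, 3, 2, 5])
print(topk_accuracy(sim, gt, ks=(1, 5)))
```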