weifeng-chen committed
Commit 901bc96 • 1 Parent(s): ec67cfd
add zero dataset and achieve better result
- README.md +4 -4
- pytorch_model.bin +1 -1
README.md CHANGED
@@ -15,7 +15,7 @@ tags:
 
 # Model Details
 
-This model is a Chinese CLIP model trained on [Noah-Wukong Dataset](https://wukong-dataset.github.io/wukong-dataset/)
+This model is a Chinese CLIP model trained on [Noah-Wukong Dataset(100M)](https://wukong-dataset.github.io/wukong-dataset/) and [Zero(23M)](https://zero.so.com/). We use ViT-B-32 from [openAI](https://github.com/openai/CLIP) as image encoder and Chinese pre-trained language model [chinese-roberta-wwm](https://huggingface.co/hfl/chinese-roberta-wwm-ext) as text encoder. We freeze the image encoder and only finetune the text encoder. The model was trained for 24 epochs and it takes about 10 days with 16 A100 GPUs.
 
 # Taiyi (太乙)
 Taiyi models are a branch of the Fengshenbang (封神榜) series of models. The models in Taiyi are pre-trained with multimodal pre-training strategies. We will release more image-text model trained on Chinese dataset and benefit the Chinese community.
@@ -65,14 +65,14 @@ with torch.no_grad():
 ### Zero-Shot Classification
 | model | dataset | Top1 | Top5 |
 | ---- | ---- | ---- | ---- |
-| Taiyi-CLIP-Roberta-102M-Chinese | ImageNet1k-CN |
+| Taiyi-CLIP-Roberta-102M-Chinese | ImageNet1k-CN | 42.85% | 71.48% |
 
 ### Zero-Shot Text-to-Image Retrieval
 
 | model | dataset | Top1 | Top5 | Top10 |
 | ---- | ---- | ---- | ---- | ---- |
-| Taiyi-CLIP-Roberta-102M-Chinese | Flickr30k-CNA-test |
-| Taiyi-CLIP-Roberta-102M-Chinese | COCO-CN-test |
+| Taiyi-CLIP-Roberta-102M-Chinese | Flickr30k-CNA-test | 46.32% | 74.58% | 83.44% |
+| Taiyi-CLIP-Roberta-102M-Chinese | COCO-CN-test | 47.10% | 78.53% | 87.84% |
 | Taiyi-CLIP-Roberta-102M-Chinese | wukong50k | 48.67% | 81.77% | 90.09% |
 
 
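
Reviewer note: the README does not say how the Top1/Top5/Top10 columns of the retrieval table are computed. They read like recall@k, so here is a minimal sketch of that metric for comparison. It assumes each text query has exactly one ground-truth image and that scores arrive as a query-by-candidate similarity matrix. The function name `recall_at_k` and the toy numbers are hypothetical and are not taken from the authors' evaluation code.

```python
from typing import Sequence


def recall_at_k(
    similarity: Sequence[Sequence[float]],
    targets: Sequence[int],
    k: int,
) -> float:
    """Fraction of queries whose ground-truth index is among the k best-scoring candidates."""
    hits = 0
    for scores, target in zip(similarity, targets):
        # Rank candidate indices by descending similarity score.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(targets)


# Toy data (made up): 3 text queries scored against 4 candidate images.
toy_similarity = [
    [0.9, 0.1, 0.3, 0.2],  # ground truth 0 is ranked first
    [0.2, 0.4, 0.8, 0.1],  # ground truth 1 is ranked second
    [0.5, 0.6, 0.1, 0.7],  # ground truth 2 is ranked last
]
toy_targets = [0, 1, 2]

top1 = recall_at_k(toy_similarity, toy_targets, k=1)
top2 = recall_at_k(toy_similarity, toy_targets, k=2)
print(f"Top1 = {top1:.2%}, Top2 = {top2:.2%}")
```

If the reported numbers use a different protocol (for example several captions per image), the README would benefit from stating it next to the table.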
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:d679dcce5801d600bce716e1fa3e13508812b9cb4ff0ff6101d12a96b3a4eae9
 size 410713709
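
Reviewer note: `pytorch_model.bin` is stored as a Git LFS pointer, so the diff only shows the new `oid` and the unchanged `size`. A downloaded copy of the weights can be checked against that pointer. The sketch below assumes the standard three-line pointer layout shown in the diff. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of any Git LFS tooling.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and hash recorded in the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a ~400 MB checkpoint is never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the '+' side of the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:d679dcce5801d600bce716e1fa3e13508812b9cb4ff0ff6101d12a96b3a4eae9\n"
    "size 410713709\n"
)
print(pointer["algo"], pointer["size"])
```

Usage, assuming the weights were downloaded to a local file named `pytorch_model.bin`: `matches_pointer("pytorch_model.bin", pointer)`.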