lyua1225 committed on
Commit 8bc3b83
1 Parent(s): 55e229d

Upload README.md

  1. README.md +1 -1
README.md CHANGED
@@ -53,7 +53,7 @@ Text encoder is the same structure as [open_clip/CLIP-VIT-H](https://huggingface
   3. Freeze the entire visual model, text encoder layer as well as the text projection layer. Only the text embedding layer is unfrozen. The purpose of this step is to align chinese word embedding with the original english word embedding such that the final projection latent space would not drift far away.
   4. After a bunch of steps, unfreeze the entire text encoder for better convergence.
 
- Notation: We use clip loss to optimize chinese text encoder. Chinese subset of [LAION-5B](https://laion.ai/blog/laion-5b/) are chosen as our training set (around 85M text-image pairs). This model was trained 75k steps with 4096 batch size so it is not completely converged at all.
+ Note: We use clip loss to optimize chinese text encoder. Chinese subset of [LAION-5B](https://laion.ai/blog/laion-5b/) are chosen as our training set (around 85M text-image pairs). This model was trained 75k steps with 4096 batch size so it is still far away from convergence.
 
 
   ## 使用 Usage