czczup commited on
Commit
655f38e
1 Parent(s): 57e108a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -36,7 +36,7 @@ For better training reproducibility, we follow the minimalist design and data ef
36
 
37
  - **Training Strategy:**
38
  - Pretraining Stage
39
- - Learnable Component: MLP
40
  - Data: Trained on 8192x4800=39.3M samples, including COYO, LAION, CC12M, CC3M, SBU, Wukong, GRIT, Objects365, OpenImages, and OCR-related datasets.
41
  - Note: In this stage, we load the pretrained weights of [InternViT-6B-448px-V1-2](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2). Moreover, in order to reduce the number of visual tokens, we use a pixel shuffle to reduce 1024 tokens to 256 tokens.
42
  - Supervised Finetuning Stage
 
36
 
37
  - **Training Strategy:**
38
  - Pretraining Stage
39
+ - Learnable Component: ViT + MLP
40
  - Data: Trained on 8192x4800=39.3M samples, including COYO, LAION, CC12M, CC3M, SBU, Wukong, GRIT, Objects365, OpenImages, and OCR-related datasets.
41
  - Note: In this stage, we load the pretrained weights of [InternViT-6B-448px-V1-2](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2). Moreover, in order to reduce the number of visual tokens, we use a pixel shuffle to reduce 1024 tokens to 256 tokens.
42
  - Supervised Finetuning Stage