Commit 59e3081 (parent: 12d27ed), committed by VictorSanh (HF staff)

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -111,7 +111,7 @@ Following this, we perform instruction fine-tuning on [The Cauldron](https://hug
   - [atlas-math-sets](https://huggingface.co/datasets/AtlasUnified/atlas-math-sets)
   - [goat](https://huggingface.co/datasets/tiedong/goat)

- We use Lora to train the parameters initialized from pre-trained backbones and full fine-tuning for newly initialized parameters (modality connector), as we find this strategy to be more stable as long as more computationally efficient.
+ We use Lora to train the parameters initialized from pre-trained backbones and full fine-tuning for newly initialized parameters (modality connector), as we find this strategy to be more stable as well as more computationally efficient.

  More details (training procedure, data selection, hyper-parameters, etc.) along with lessons learned from our ablations will be available in an upcoming technical report.
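For context, the sentence touched by this commit describes the fine-tuning strategy: LoRA adapters on the parameters initialized from the pre-trained backbones, and full fine-tuning for the newly initialized modality connector. Below is a minimal sketch of that setup using 🤗 PEFT; the rank, the `target_modules` regex, and the use of `modules_to_save=["connector"]` are illustrative assumptions rather than the exact configuration used to train Idefics2.

```python
# Sketch only: LoRA on the pre-trained backbone, full fine-tuning of the modality
# connector. Ranks and module names are assumptions, not the authors' recipe.
import torch
from transformers import Idefics2ForConditionalGeneration
from peft import LoraConfig, get_peft_model

model = Idefics2ForConditionalGeneration.from_pretrained(
    "HuggingFaceM4/idefics2-8b",
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=8,               # illustrative rank
    lora_alpha=16,
    lora_dropout=0.05,
    # Regex over module names: attach LoRA adapters to the attention projections
    # of the pre-trained language backbone (assumed target modules).
    target_modules=r".*text_model.*(q_proj|k_proj|v_proj|o_proj)",
    # Keep the newly initialized modality connector fully trainable and save it
    # alongside the adapter weights ("connector" is the module name in the
    # transformers Idefics2 implementation).
    modules_to_save=["connector"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # LoRA params + connector trainable; backbones frozen
```

With this configuration, only the LoRA matrices and the listed `modules_to_save` receive gradients, while the backbone weights stay frozen, which is one way to realize the "more stable as well as more computationally efficient" trade-off the edited sentence refers to.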