Image-Text-to-Text
Transformers
Safetensors
English
idefics2
pretraining
multimodal
vision
Inference Endpoints
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -107,7 +107,7 @@ Following this, we perform instruction fine-tuning on [The Cauldron](https://hug
 - [atlas-math-sets](https://huggingface.co/datasets/AtlasUnified/atlas-math-sets)
 - [goat](https://huggingface.co/datasets/tiedong/goat)
 
-We use Lora to train the parameters initialized from pre-trained backbones and full fine-tuning for newly initialized parameters (modality connector), as we find this strategy to be more stable as long as more computationally efficient.
+We use Lora to train the parameters initialized from pre-trained backbones and full fine-tuning for newly initialized parameters (modality connector), as we find this strategy to be more stable as well as more computationally efficient.
 
 More details (training procedure, data selection, hyper-parameters, etc.) along with lessons learned from our ablations will be available in an upcoming technical report.
 
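For context, the strategy described in the changed line (LoRA adapters on the pre-trained backbone weights, full fine-tuning for the newly initialized modality connector) maps naturally onto `peft`'s `modules_to_save` option. The sketch below is illustrative, not the authors' actual training code: the `target_modules` projection names and the `"connector"` module name are assumptions, so check `model.named_modules()` for the real Idefics2 layer names before using it.

```python
import torch
from transformers import AutoModelForVision2Seq
from peft import LoraConfig, get_peft_model

# Load the pre-trained model (bfloat16 to keep memory manageable).
model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b", torch_dtype=torch.bfloat16
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    # Attach LoRA adapters to the backbone's attention projections
    # (module names here are assumptions for illustration).
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    # Keep the newly initialized modality connector fully trainable
    # instead of adapting it with LoRA ("connector" is a hypothetical name).
    modules_to_save=["connector"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

With this setup, only the LoRA matrices and the listed `modules_to_save` receive gradients, which is one way to get the stability/efficiency trade-off the corrected sentence describes.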