Image-Text-to-Text
Transformers
Safetensors
English
idefics2
pretraining
multimodal
vision
Inference Endpoints
5 papers

How many images can it take as an input at a time?

#39
by jysung - opened

How many images can it take as an input at a time?

HuggingFaceM4 org

hi @jysung , in theory, there are no limits. in practise, I would recommend going beyond 3 or 4, since that's the upper end of the number images we saw in a single document during training

HugoLaurencon changed discussion status to closed

Sign up or log in to comment