OpenGVLab/InternVL-Chat-V1-5
With multi-image input, the model cannot distinguish the individual images and instead treats them as one stitched-together image?
#17 by jamestang0219, opened 25 days ago
jamestang0219, 25 days ago
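Judging from the code, this appears to be how the embeddings passed from the ViT to the LLM are handled. Does that mean the model cannot actually tell which tokens belong to which image?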
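To make the question concrete, here is a minimal pure-Python sketch of the pattern being asked about: visual embeddings from several images are flattened into one stream and written into placeholder positions of the text sequence. This is not the InternVL code. `IMG_CONTEXT`, `merge_visual_embeddings`, `text_embed`, and the toy embeddings are all hypothetical names and values chosen for illustration, and the real model may add separators or other cues that this sketch leaves out.

```python
# Hypothetical placeholder id; the real model defines its own special tokens.
IMG_CONTEXT = -1


def text_embed(token_id):
    """Toy stand-in for the LLM's input embedding lookup (assumption)."""
    return (float(token_id), 0.0)


def merge_visual_embeddings(input_ids, vit_embeds_per_image):
    """Fill placeholder positions with visual embeddings, in order.

    vit_embeds_per_image holds one list of embeddings per image.
    Returns (merged, origin). `origin` records which image each position
    came from and exists only for inspection here; it is NOT part of what
    a language model would receive.
    """
    # Flatten every image into a single stream, like a plain concatenation.
    flat = [
        (emb, img_idx)
        for img_idx, embs in enumerate(vit_embeds_per_image)
        for emb in embs
    ]
    n_slots = sum(1 for tok in input_ids if tok == IMG_CONTEXT)
    if n_slots != len(flat):
        raise ValueError(
            f"{n_slots} placeholder slots but {len(flat)} visual embeddings"
        )

    merged, origin = [], []
    stream = iter(flat)
    for tok in input_ids:
        if tok == IMG_CONTEXT:
            emb, img_idx = next(stream)
            merged.append(emb)
            origin.append(img_idx)
        else:
            merged.append(text_embed(tok))
            origin.append(None)
    return merged, origin


# Two toy images with two visual tokens each.
image_a = [(0.1, 0.2), (0.3, 0.4)]
image_b = [(0.5, 0.6), (0.7, 0.8)]

# One contiguous run of placeholders between two text tokens.
input_ids = [10, IMG_CONTEXT, IMG_CONTEXT, IMG_CONTEXT, IMG_CONTEXT, 11]
merged, origin = merge_visual_embeddings(input_ids, [image_a, image_b])
```

In this sketch, `merged` is just a flat list of vectors. The boundary between `image_a` and `image_b` is visible only in the side-channel `origin` list. If the actual pipeline behaves this way, any notion of "which tokens belong to which image" would have to come from the prompt layout (for example, separator tokens between placeholder runs) or from what the model learned in training, rather than from the merge step itself.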