Hugo Laurençon

HugoLaurencon

Organizations

BigScience Workshop, Big Science - Modeling Metadata, BigScience Catalogue Data, BigScience Data, Speech Seq2Seq Experiments, Internal Data, Team 8, huggingPartyParis, Open Source Contribution

Posts 5

Idefics2 is trained mostly on OBELICS, our open interleaved image-text document dataset.

Training on interleaved data is crucial for reaching high performance on VQA tasks, handling an arbitrary number of images as input, and performing in-context learning.

Dataset: HuggingFaceM4/OBELICS
Nomic visualization: https://atlas.nomic.ai/map/f2fba2aa-3647-4f49-a0f3-9347daeee499/ee4a84bd-f125-4bcc-a683-1b4e231cb10f
Link to OBELICS thread: https://twitter.com/HugoLaurencon/status/1694005892839006301
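
To make "interleaved image-text document" concrete, here is a minimal sketch in Python. It assumes a document is stored as two parallel lists, `images` and `texts`, where each position holds either an image URL or a text span and the other entry is None. That layout, the `interleave` helper, the `<image:...>` placeholder format, and the example URLs are all illustrative assumptions, not a description of how Idefics2 actually consumes the data.

```python
from typing import List, Optional


def interleave(images: List[Optional[str]], texts: List[Optional[str]]) -> List[str]:
    """Merge parallel image/text lists into one ordered sequence.

    Assumption: at every position exactly one of the two entries is set,
    so the original order of images and paragraphs is preserved.
    """
    if len(images) != len(texts):
        raise ValueError("images and texts must have the same length")
    sequence = []
    for image_url, text in zip(images, texts):
        if image_url is not None and text is None:
            # Hypothetical placeholder marking where an image sits in the document.
            sequence.append(f"<image:{image_url}>")
        elif text is not None and image_url is None:
            sequence.append(text)
        else:
            raise ValueError("each position must hold exactly one of image or text")
    return sequence


# Hypothetical document with two images interleaved with two text spans.
example = {
    "images": ["https://example.com/cat.jpg", None, "https://example.com/dog.jpg", None],
    "texts": [None, "A cat on a sofa.", None, "A dog in a park."],
}

for item in interleave(example["images"], example["texts"]):
    print(item)
```

Because the sequence can hold any number of image placeholders, the same structure covers both multi-image inputs and few-shot prompts made of several image-text pairs. To try this on real data, one option is to stream the dataset with `datasets.load_dataset("HuggingFaceM4/OBELICS", split="train", streaming=True)` and check the dataset card for the exact field names before relying on the layout assumed here.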
