Building and better understanding vision-language models: insights and future directions • Paper • 2408.12637 • Published Aug 22, 2024
What matters when building vision-language models? • Paper • 2405.02246 • Published May 3, 2024
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset • Paper • 2403.09029 • Published Mar 14, 2024
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents • Paper • 2306.16527 • Published Jun 21, 2023