Scaling Pre-training to One Hundred Billion Data for Vision Language Models Paper • 2502.07617 • Published Feb 11