Introducing Idefics2: A Powerful 8B Vision-Language Model for the community
Something I am very excited about with synthetic data is how much more control you have over shaping the data to look the way you want.
We typically spend a lot of time filtering web-scale data by building heuristics that detect "poor-quality" samples. With control over the data creation process, you can instead quickly tune the generation itself to give the data specific properties. Often it's just about telling your model to do X, and not to do Y.
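To make the contrast concrete, here is a minimal sketch of the two approaches side by side. Everything in it is hypothetical and for illustration only: the `looks_poor_quality` heuristics, the thresholds, the `build_generation_prompt` helper, and the example rules are my assumptions, not something taken from a real pipeline.

```python
def looks_poor_quality(text: str) -> bool:
    """Hypothetical web-data heuristics: flag very short or mostly non-alphabetic text."""
    words = text.split()
    if len(words) < 5:  # assumed threshold
        return True
    alpha_chars = sum(ch.isalpha() for ch in text)
    return alpha_chars / max(len(text), 1) < 0.6  # assumed threshold


def build_generation_prompt(task: str, do: list[str], dont: list[str]) -> str:
    """Hypothetical helper: state the desired data properties directly in the prompt."""
    lines = [task, "", "Requirements:"]
    lines += [f"- Do: {rule}" for rule in do]
    lines += [f"- Do not: {rule}" for rule in dont]
    return "\n".join(lines)


# Approach 1: filter scraped samples after the fact with heuristics.
scraped = [
    "click here!!! >>> $$$",
    "The chart shows monthly revenue rising from January to June.",
]
kept = [sample for sample in scraped if not looks_poor_quality(sample)]

# Approach 2: ask for the properties up front when generating synthetic data.
prompt = build_generation_prompt(
    task="Write a question and answer about the attached chart.",
    do=["refer to specific values shown in the chart"],
    dont=["mention information that is not visible in the image"],
)

print(kept)
print(prompt)
```

The first approach needs a new heuristic for every failure mode you discover, while in the second, changing a property of the dataset is a one-line edit to the `do` or `dont` list. How well a given model actually follows those rules is a separate question that you would still need to check on the generated samples.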