Thresholding matches

#2
by jbergq - opened

For rare parts, the top matches are usually correct but the remaining ones are false positives. Have you considered thresholding the similarity score between embeddings and only showing matches above the threshold? You could also tune the threshold for a good trade-off between true positives and false positives with an ROC curve.
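To make the suggestion concrete, here is a minimal sketch of what I mean. It assumes embeddings are plain lists of floats compared with cosine similarity (I don't know which similarity measure the model actually uses), and all function names are made up for illustration. `roc_points` additionally assumes you have a small set of scores hand-labeled as match (1) or non-match (0), with at least one of each.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def matches_above_threshold(query, candidates, threshold):
    """Return (index, score) pairs with score >= threshold, best first."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    kept = [(i, s) for i, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

def roc_points(scores, labels):
    """One (threshold, false positive rate, true positive rate) per distinct score.

    Assumes labels contains at least one 1 (match) and one 0 (non-match).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / negatives, tp / positives))
    return points
```

Plotting the false positive rate against the true positive rate from `roc_points` gives the ROC curve, and you can pick the threshold at whatever point on it suits the product.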

Identifying exact matches will of course be very important when turning this into a product, so it's good to see that you recognize that.

But the way I look at it, it's not essential that this foundation model, which has been trained in an unsupervised way, can handle that. My idea for a roadmap goes something like:

  1. Collect a bigger dataset (>100 batches)
  2. Pick a target part
  3. Use the foundation model to pick ~20-50 top candidates for being exact matches
  4. Do manual binary labeling of these candidates
  5. Split data into train/test splits
  6. Train a classifier on the training data. Bear in mind that we can train on far more than 20-50 data points because we have a lot of images of the same parts.
  7. Evaluate result on test data
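Steps 5 to 7 could be sketched roughly like this. It's only an illustration: I'm assuming the labeled candidates are `(embedding, label)` pairs with label 1 for an exact match and 0 otherwise, and I'm using a nearest-centroid classifier as a stand-in because the actual classifier hasn't been decided. All names here are hypothetical.

```python
import random

def train_test_split(items, test_fraction=0.25, seed=0):
    """Shuffle deterministically and split into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_centroids(train):
    """Compute the mean embedding per label from (embedding, label) pairs."""
    sums, counts = {}, {}
    for emb, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(emb)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], emb)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, emb):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((x - y) ** 2 for x, y in zip(emb, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, test):
    """Fraction of test pairs whose predicted label equals the true label."""
    correct = sum(1 for emb, label in test if predict(centroids, emb) == label)
    return correct / len(test)
```

Usage would be `train, test = train_test_split(labeled)`, then `centroids = fit_centroids(train)` and `accuracy(centroids, test)`. Since we have many images of the same parts, it's probably worth making sure images of the same physical part don't end up on both sides of the split, which this simple random split does not guarantee.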

But, if the model can do this classification without supervision, that would be awesome. I'm more than ready to be proven wrong!

I see! I think I lack some understanding of the exact future use-case you have in mind. A classifier might be the way to go if you are looking for a specific set of parts that are known beforehand. It would be limited to parts it has been trained on though and would need labeled examples and retraining to learn new parts. What I like about the current approach is that it is not limited to a fixed set of parts in the same way. A good, general feature extractor could find matches for parts that it has never even seen before. These two approaches differ slightly in how they operate though; the classifier answers the question "what part (or part category) is this?" while the feature extractor answers "are these two parts the same?". Depending on the future use-cases you have in mind, there could be pros and cons of each.

If you find the feature extractor compelling, I think the crux will be how to make it good enough to work for new parts. There are many different approaches, both supervised and unsupervised. I'm not sure what you use currently, but if you like the idea it might be worth looking into further. HF has a great guide about image similarity systems which might be interesting for inspiration. There has also been an explosion of so-called vector databases recently, which specialize in efficient comparisons of many embeddings. If it suits your use-case, they would allow you to pre-embed thousands of patches and use approximate nearest-neighbour search to quickly find the best matches for a new patch.
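To illustrate the pre-embed-then-search idea, here is a small sketch. Note that it does exact brute-force search rather than the approximate search a vector database would give you, and `BruteForceIndex` is a made-up class, not the API of any real library. It assumes non-zero embeddings and ranks by cosine similarity.

```python
import heapq
import math

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

class BruteForceIndex:
    """Hypothetical in-memory index: stores unit vectors, searches exhaustively."""

    def __init__(self):
        self._ids = []
        self._vectors = []

    def add(self, item_id, embedding):
        """Pre-embed step: store the normalized embedding under item_id."""
        self._ids.append(item_id)
        self._vectors.append(normalize(embedding))

    def search(self, query, k=5):
        """Return the k stored items most similar to query as (id, score) pairs."""
        q = normalize(query)
        scored = (
            (sum(a * b for a, b in zip(q, v)), item_id)
            for v, item_id in zip(self._vectors, self._ids)
        )
        # On unit vectors the dot product equals cosine similarity.
        best = heapq.nlargest(k, scored, key=lambda pair: pair[0])
        return [(item_id, score) for score, item_id in best]
```

Brute force compares the query against every stored patch, which is fine for a few thousand embeddings. A vector database becomes interesting once that linear scan gets too slow, at the cost of occasionally missing a true nearest neighbour.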
