Is there a way to use the Hugging Face model for cross-modal retrieval tasks?
There's an effort to add it: https://github.com/huggingface/transformers/pull/29261