vladbogo 
posted an update Apr 1
Google DeepMind introduces Gecko, a new text embedding model! Gecko is trained with a two-step process that combines synthetic data generation with reranking.

Key points:
* Uses an LLM to generate diverse synthetic queries and tasks from web passages
* Refines the data by retrieving candidate passages and relabeling positives/negatives using the same LLM
* Achieves strong results on the Massive Text Embedding Benchmark (MTEB), where the compact 256D Gecko outperforms 768D models.
* 768D Gecko achieves state-of-the-art performance, competing with much larger models.
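
To make the two-step process more concrete, here is a minimal toy sketch of the data flow. It is not the authors' implementation: `toy_generate`, `toy_retrieve_score`, and `toy_llm_score` are hypothetical stand-ins that use simple word overlap in place of a real LLM and retriever, and picking the lowest-ranked candidate as the hard negative is just one simple choice made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TrainingExample:
    task: str
    query: str
    positive: str
    hard_negative: str


def tokens(text: str) -> set:
    return set(text.lower().split())


def toy_generate(passage: str) -> Tuple[str, str]:
    """Hypothetical stand-in for step 1: an LLM would write a task and a query."""
    query = " ".join(passage.lower().split()[:3])
    return "question answering", query


def toy_retrieve_score(query: str, passage: str) -> float:
    """Hypothetical stand-in for a first-stage retriever: shared word count."""
    return float(len(tokens(query) & tokens(passage)))


def toy_llm_score(query: str, passage: str) -> float:
    """Hypothetical stand-in for the LLM relevance judgment: Jaccard overlap."""
    q, p = tokens(query), tokens(passage)
    return len(q & p) / len(q | p) if (q | p) else 0.0


def build_examples(
    corpus: List[str],
    generate: Callable[[str], Tuple[str, str]],
    retrieve_score: Callable[[str, str], float],
    llm_score: Callable[[str, str], float],
    top_k: int = 2,
) -> List[TrainingExample]:
    if top_k < 2:
        raise ValueError("top_k must be at least 2 to pick a positive and a negative")
    examples = []
    for seed in corpus:
        # Step 1: generate a synthetic task and query from the seed passage.
        task, query = generate(seed)
        # Step 2a: retrieve candidate passages for the generated query.
        candidates = sorted(
            corpus, key=lambda p: retrieve_score(query, p), reverse=True
        )[:top_k]
        # Step 2b: rerank the candidates and relabel positive and negative.
        ranked = sorted(candidates, key=lambda p: llm_score(query, p), reverse=True)
        examples.append(TrainingExample(task, query, ranked[0], ranked[-1]))
    return examples


corpus = [
    "Geckos are small lizards found in warm climates",
    "Text embeddings map sentences to dense vectors",
    "Lizards found in deserts tolerate heat well",
]
examples = build_examples(corpus, toy_generate, toy_retrieve_score, toy_llm_score)
for ex in examples:
    print(ex)
```

Note that the positive is chosen by the reranking step rather than assumed to be the seed passage, so in principle another passage can end up as the positive when it scores higher for the generated query.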

Paper: Gecko: Versatile Text Embeddings Distilled from Large Language Models (2403.20327)
More details in my blog: https://huggingface.co/blog/vladbogo/gecko

Congrats to the authors for their work!