Arabic Matryoshka Embedding Models
A collection of Arabic Matryoshka embedding models designed for efficient, high-performance Arabic NLP, publicly available on Hugging Face.
Enhancing Semantic Similarity Understanding in Arabic NLP with Nested Embedding Learning
Paper • 2407.21139 • Published • 3
Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2
Sentence Similarity • Updated • 1.85k • 10
Note: This is a nested embedding model trained with Matryoshka learning. It ranked first on the MTEB leaderboard for STS17 with a score of 85.3. It is efficient across a range of NLP tasks, including semantic similarity and textual entailment in Arabic.
Omartificial-Intelligence-Space/Arabic-mpnet-base-all-nli-triplet
Sentence Similarity • Updated • 1k • 10
Note: This model is fine-tuned from "tomaarsen/mpnet-base-all-nli-triplet", an English model that is itself based on "microsoft/mpnet-base". Although the base model was trained primarily on English data and saw only a few Arabic tokens, the fine-tuned model performs well on Arabic NLP tasks, scoring 79.9 on STS17 on the MTEB leaderboard.
Omartificial-Intelligence-Space/Arabic-all-nli-triplet-Matryoshka
Sentence Similarity • Updated • 276 • 1
Note: This model is fine-tuned from "sentence-transformers/paraphrase-multilingual-mpnet-base-v2" and adapted specifically for Arabic NLP tasks. It scored 82.4 on STS17 on the MTEB leaderboard, making it a strong choice for Arabic sentence similarity.
Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka
Sentence Similarity • Updated • 547 • 10
Note: This is a nested embedding model trained with Matryoshka learning. It scored 83.16 on the STS17 leaderboard. It is efficient across a range of NLP tasks, including semantic similarity and textual entailment in Arabic.
Omartificial-Intelligence-Space/Arabic-labse-Matryoshka
Sentence Similarity • Updated • 268 • 2
Note: This sentence-transformers model, fine-tuned from sentence-transformers/LaBSE, ranked second on the STS17 MTEB leaderboard with a score of 82.47. It adapts LaBSE to Arabic, making it a robust choice for tasks that require accurate semantic similarity and textual entailment in Arabic.
Omartificial-Intelligence-Space/Marbert-all-nli-triplet-Matryoshka
Sentence Similarity • Updated • 223 • 1
Note: This model, fine-tuned from the MarBERT base, ranked fourth on the STS17 MTEB leaderboard with a score of 82.18. It builds on the MarBERT architecture, which is designed specifically for Arabic language processing, and improves on it through Matryoshka fine-tuning.
Omartificial-Intelligence-Space/Arabic-MiniLM-L12-v2-all-nli-triplet
Sentence Similarity • Updated • 458 • 4
Note: This model, fine-tuned from the sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 base, scored 81.11 on the STS17 MTEB leaderboard.
Omartificial-Intelligence-Space/E5-all-nli-triplet-Matryoshka
Sentence Similarity • Updated • 44 • 1
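Several notes above describe these as nested (Matryoshka) embedding models. The usual idea behind such models is that the leading dimensions of an embedding remain usable on their own, so a vector can be shortened and re-normalized before comparison. The following is a minimal sketch of that truncation step only, not of any model in this collection: the vectors are made-up toy values rather than real model output, and the helper names `truncate_and_normalize` and `cosine` are hypothetical.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy 8-dimensional vectors standing in for sentence embeddings.
# These are illustrative values, NOT output from any model listed above.
emb_a = [0.9, 0.1, 0.3, 0.2, 0.05, 0.01, 0.02, 0.03]
emb_b = [0.8, 0.2, 0.25, 0.3, 0.01, 0.04, 0.03, 0.02]

# Compare the same pair at progressively smaller embedding sizes.
for dim in (8, 4, 2):
    a = truncate_and_normalize(emb_a, dim)
    b = truncate_and_normalize(emb_b, dim)
    print(dim, round(cosine(a, b), 4))
```

With a real Matryoshka model, the embeddings would come from the model itself, and the dimensions worth truncating to depend on how that model was trained; each model card is the place to check for the supported sizes.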