Model Details
This embedding model is a 10.7B-parameter LLM fine-tuned from upstage/SOLAR-10.7B-v1.0 on Intel Gaudi 2 processors.
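A minimal usage sketch for generating embeddings with this model. The repository name Intel/neural-embedding-v1 is taken from this card; loading through the sentence-transformers library with default pooling is an assumption, since the card does not document the inference API.

```python
# Hypothetical usage sketch: assumes the model loads via sentence-transformers
# with default pooling; the card does not document the exact inference API.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Intel/neural-embedding-v1")

sentences = [
    "Intel Gaudi 2 is an AI accelerator.",
    "Embedding models map text to dense vectors.",
]
# Normalized embeddings let cosine similarity reduce to a dot product.
embeddings = model.encode(sentences, normalize_embeddings=True)

similarity = embeddings[0] @ embeddings[1]
print(embeddings.shape, similarity)
```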
Date
July 2024
Training Details
Two-stage training:
- General Text Embedding Training
- Domain-Specific Embedding Training
More technical details will be published later.
Evaluation
Results on the English [MTEB](https://huggingface.co/spaces/mteb/leaderboard) leaderboard:
| Model Name | MTEB (56 datasets) |
|---|---|
| bge-base-en-1.5 | 64.23 |
| bge-large-en-1.5 | 63.55 |
| gte-large-en-v1.5 | 65.39 |
| gte-base-en-v1.5 | 64.11 |
| mxbai-embed-large-v1 | 64.68 |
| multilingual-e5-base | 59.45 |
| multilingual-e5-large | 61.50 |
| e5-mistral-7b-instruct | 66.63 |
| gte-Qwen1.5-7B-instruct | 67.34 |
| NV-Embed-v1 | 69.32 |
| neural-embedding-v1 | 69.94 |
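A hedged sketch of how such scores could be reproduced with the open-source mteb library. The task chosen below is illustrative only (the full English average spans 56 datasets), and loading the model through sentence-transformers is an assumption, as above.

```python
# Illustrative sketch of running one MTEB task; reproducing the 56-dataset
# English average requires running the complete English task set.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Intel/neural-embedding-v1")  # loading path assumed

# Evaluate a single English classification task as an example;
# per-task scores are written to the output folder as JSON.
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="results/neural-embedding-v1")
```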
Evaluation results
| Task | Split | Metric | Value (self-reported) |
|---|---|---|---|
| MTEB AmazonCounterfactualClassification (en) | test | accuracy | 93.104 |
| MTEB AmazonCounterfactualClassification (en) | test | ap | 72.527 |
| MTEB AmazonCounterfactualClassification (en) | test | f1 | 89.675 |
| MTEB AmazonPolarityClassification | test | accuracy | 97.537 |
| MTEB AmazonPolarityClassification | test | ap | 96.468 |
| MTEB AmazonPolarityClassification | test | f1 | 97.536 |
| MTEB AmazonReviewsClassification (en) | test | accuracy | 61.174 |
| MTEB AmazonReviewsClassification (en) | test | f1 | 60.405 |
| MTEB ArguAna | test | map_at_1 | 44.452 |
| MTEB ArguAna | test | map_at_10 | 59.563 |