thenlper committed
Commit 57224f8
1 Parent(s): 8063630

Update README.md

Files changed (1): README.md (+14 -1)
README.md CHANGED
@@ -3171,7 +3171,7 @@ model-index:
 
  ## gte-Qwen2-7B-instruct
 
- **gte-Qwen2-7B-instruct** is the latest model in the gte (General Text Embedding) model family.
+ **gte-Qwen2-7B-instruct** is the latest model in the gte (General Text Embedding) model family, and it ranks **No. 1** in both the English and Chinese evaluations on the [Massive Text Embedding Benchmark (MTEB) leaderboard](https://huggingface.co/spaces/mteb/leaderboard) (as of June 16, 2024).
 
  Recently, the [**Qwen team**](https://huggingface.co/Qwen) released the Qwen2 series models, and we have trained the **gte-Qwen2-7B-instruct** model based on the [Qwen2-7B](https://huggingface.co/Qwen/Qwen2-7B) LLM model. Compared to the [gte-Qwen1.5-7B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen1.5-7B-instruct) model, the **gte-Qwen2-7B-instruct** model uses the same training data and training strategies during the finetuning stage, with the only difference being the base model, which is upgraded to Qwen2-7B. Given the improvements of the Qwen2 series over the Qwen1.5 series, we can expect consistent performance gains in the embedding models as well.
 
@@ -3302,6 +3302,19 @@ You can use the [scripts/eval_mteb.py](https://huggingface.co/Alibaba-NLP/gte-Qw
 
  The gte series models have consistently released two types of models: encoder-only models (based on the BERT architecture) and decoder-only models (based on the LLM architecture).
 
+ | Models | Language | Max Sequence Length | Dimension | Model Size (Memory Usage, fp32) |
+ |:-------------------------------------------------------------------------------------:|:------------:|:-------------------:|:---------:|:-------------------------------:|
+ | [GTE-large-zh](https://huggingface.co/thenlper/gte-large-zh) | Chinese | 512 | 1024 | 1.25GB |
+ | [GTE-base-zh](https://huggingface.co/thenlper/gte-base-zh) | Chinese | 512 | 512 | 0.41GB |
+ | [GTE-small-zh](https://huggingface.co/thenlper/gte-small-zh) | Chinese | 512 | 512 | 0.12GB |
+ | [GTE-large](https://huggingface.co/thenlper/gte-large) | English | 512 | 1024 | 1.25GB |
+ | [GTE-base](https://huggingface.co/thenlper/gte-base) | English | 512 | 512 | 0.21GB |
+ | [GTE-small](https://huggingface.co/thenlper/gte-small) | English | 512 | 384 | 0.10GB |
+ | [GTE-large-en-v1.5](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5) | English | 8192 | 1024 | 1.74GB |
+ | [GTE-base-en-v1.5](https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5) | English | 8192 | 768 | 0.51GB |
+ | [GTE-Qwen1.5-7B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen1.5-7B-instruct) | Multilingual | 32000 | 4096 | 26.45GB |
+ | [GTE-Qwen2-7B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct) | Multilingual | 32000 | 4096 | 26.45GB |
+
  ## Citation
 
  If you find our paper or models helpful, please consider citing:
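The Dimension column in the table above is the size of the vector each model returns, and retrieval with these embeddings is typically scored by cosine similarity between normalized vectors. A minimal sketch of that comparison step follows; the dummy 384-dimensional vectors stand in for real GTE-small output so the example stays self-contained (in actual use, the vectors would come from an encode call such as `SentenceTransformer("thenlper/gte-small").encode(texts)`):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

# Dummy 384-dim vectors matching GTE-small's output dimension.
rng = np.random.default_rng(0)
query = rng.normal(size=384)
doc = query + 0.1 * rng.normal(size=384)   # a near-duplicate document
other = rng.normal(size=384)               # an unrelated document

# A relevant document scores higher against the query than an unrelated one.
print(cosine_similarity(query, doc) > cosine_similarity(query, other))  # True
```

The same comparison applies to every model in the table; only the vector dimension (384–4096) changes.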