---
license: apache-2.0
pipeline_tag: text-classification
tags:
- transformers
- sentence-transformers
- text-embeddings-inference
---

## gte-multilingual-reranker-base

Using Huggingface transformers (transformers>=4.36.0)
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name_or_path = "Alibaba-NLP/gte-multilingual-reranker-base"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForSequenceClassification.from_pretrained(model_name_or_path, trust_remote_code=True)
model.eval()

pairs = [["中国的首都在哪儿","北京"], ["what is the capital of China?", "北京"], ["how to implement quick sort in python?","Introduction of quick sort"]]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1, ).float()
    print(scores)

# tensor([1.2315, 0.5923, 0.3041])
```
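
The model card also tags `sentence-transformers`, so the same query/passage pairs can be scored through its `CrossEncoder` wrapper. The snippet below is a minimal sketch, not taken from the original card, and assumes a recent sentence-transformers release whose `CrossEncoder` accepts `trust_remote_code`:

```python
# Sketch: scoring query/passage pairs with sentence-transformers' CrossEncoder.
# Assumes sentence-transformers is new enough to pass trust_remote_code through to transformers.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder(
    "Alibaba-NLP/gte-multilingual-reranker-base",
    trust_remote_code=True,
    max_length=512,
)

pairs = [
    ["中国的首都在哪儿", "北京"],
    ["what is the capital of China?", "北京"],
    ["how to implement quick sort in python?", "Introduction of quick sort"],
]

scores = reranker.predict(pairs)  # numpy array with one relevance score per pair
print(scores)
```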

### How to use it offline
Refer to [Disable trust_remote_code](https://huggingface.co/Alibaba-NLP/new-impl/discussions/2#662b08d04d8c3d0a09c88fa3)
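
The linked discussion covers how to avoid `trust_remote_code` altogether. Independent of that, a generic offline pattern (a sketch, not the procedure from the discussion) is to load the model once while online so that the weights, tokenizer files, and remote modeling code land in the local Hugging Face cache, and then force cache-only resolution on later runs:

```python
# Sketch: after one online run has populated the local cache, force offline resolution
# via the standard Hugging Face environment variables (set them before importing transformers).
import os

os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: no network calls
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: resolve files from the local cache only

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name_or_path = "Alibaba-NLP/gte-multilingual-reranker-base"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForSequenceClassification.from_pretrained(model_name_or_path, trust_remote_code=True)
```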

## Evaluation

Results of reranking based on multiple text retrieval datasets

![image](./images/mgte-reranker.png)

**More detailed experimental results can be found in the [paper](https://arxiv.org/pdf/2407.19669)**.

## Cloud API Services

In addition to the open-source [GTE](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469) series models, the GTE models are also available as commercial API services on Alibaba Cloud.

- [Embedding Models](https://help.aliyun.com/zh/model-studio/developer-reference/general-text-embedding/): Three versions of the text embedding models are available: text-embedding-v1/v2/v3, with v3 being the latest API service.
- [ReRank Models](https://help.aliyun.com/zh/model-studio/developer-reference/general-text-sorting-model/): The gte-rerank model service is available (a rough call sketch is shown after this list).

Note that the models behind the commercial APIs are not entirely identical to the open-source models.
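
As an illustration only, a call to the commercial `gte-rerank` service through Alibaba Cloud's DashScope Python SDK might look roughly like the sketch below. The method name, parameters, and response structure are assumptions here and should be checked against the linked documentation:

```python
# Hypothetical sketch of calling the commercial gte-rerank API via the DashScope SDK.
# Parameter names and the response format are assumptions; see the Alibaba Cloud docs.
import dashscope

response = dashscope.TextReRank.call(
    model="gte-rerank",
    query="what is the capital of China?",
    documents=["北京", "Introduction of quick sort"],
    top_n=2,
    return_documents=True,
)
print(response)  # inspect the returned relevance scores / ranked documents
```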

## Citation
```
@misc{zhang2024mgtegeneralizedlongcontexttext,