---
license: apache-2.0
---

## gte-multilingual-reranker-base

The **gte-multilingual-reranker-base** model is the first reranker model in the [GTE](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469) family of models, featuring several key attributes:
- **High Performance**: Achieves state-of-the-art (SOTA) results in multilingual retrieval tasks and multi-task representation model evaluations when compared to reranker models of similar size.
 

## Model Information
- Model Size: 306M
- Max Input Tokens: 8192

### Usage

Using Hugging Face Transformers (transformers>=4.36.0):
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name_or_path = "Alibaba-NLP/gte-multilingual-reranker-base"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
# trust_remote_code is required because the checkpoint ships custom modeling code
# (see "How to use it offline" below for running without it).
model = AutoModelForSequenceClassification.from_pretrained(
    model_name_or_path, trust_remote_code=True, torch_dtype=torch.float16
)
model.eval()

# Each pair is (query, passage); the model outputs one relevance score per pair.
pairs = [["what is the capital of China?", "Beijing"],
         ["how to implement quick sort in python?", "Introduction of quick sort"]]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1, ).float()
    print(scores)
```
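The scores are raw logits, so their main use is ordering candidate passages for a query. A minimal reranking sketch, using placeholder scores since the actual values depend on the model:

```python
# Rerank candidate passages by relevance score (highest first).
# The scores below are hypothetical stand-ins for the model's logits.
query = "what is the capital of China?"
docs = ["Introduction of quick sort", "Beijing", "gte rerank model"]
scores = [0.12, 9.37, 1.05]  # hypothetical: one logit per (query, doc) pair

reranked = sorted(zip(docs, scores), key=lambda x: x[1], reverse=True)
print([doc for doc, _ in reranked])
```

In a retrieval pipeline, this sort step typically runs over the top-k passages returned by a first-stage retriever.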

### How to use it offline
Refer to [Disable trust_remote_code](https://huggingface.co/Alibaba-NLP/new-impl/discussions/2#662b08d04d8c3d0a09c88fa3)

## Citation
```
@misc{zhang2024mgtegeneralizedlongcontexttext,
  title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval},
  author={Xin Zhang and Yanzhao Zhang and Dingkun Long and Wen Xie and Ziqi Dai and Jialong Tang and Huan Lin and Baosong Yang and Pengjun Xie and Fei Huang and Meishan Zhang and Wenjie Li and Min Zhang},
  year={2024},
  eprint={2407.19669},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2407.19669},
}
```