metadata

license: mit
language:
  - en

RankingGPT-bloom-560m

RankingGPT is a text ranker built on large language models that is effective both in-domain and out-of-domain. We provide RankingGPT in several sizes and base models: bloom-560m, bloom-1b1, bloom-3b, bloom-7b, llama2-7b, baichuan2-7b and qwen-7b.

For more details, please refer to our paper and GitHub repository.

Usage

Code example

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('RankingGPT-bloom-560m')
model = AutoModelForCausalLM.from_pretrained('RankingGPT-bloom-560m').eval()

query = 'when should a baby walk'
document = 'Most babies start to walk around 13 months, but your baby may start walking as early as 9 or 10 months or as late as 15 or 16 months.'

# The pair is scored by how likely the query is given the document.
context = f'Document: {document} Query:'
example = context + query

# Tokenize the context and the query (the continuation) separately.
context_enc = tokenizer.encode(context, add_special_tokens=False)
continuation_enc = tokenizer.encode(query, add_special_tokens=False)
# Drop the last query token: the logits at position i predict token i + 1.
model_input = torch.tensor(context_enc + continuation_enc[:-1])
continuation_len = len(continuation_enc)
input_len, = model_input.shape

with torch.no_grad():
    logprobs = torch.nn.functional.log_softmax(model(model_input.unsqueeze(dim=0))[0], dim=-1)[0]

# Keep the positions that predict the query tokens, then select each token's log-probability.
logprobs = logprobs[input_len - continuation_len:]
logprobs = torch.gather(logprobs, 1, torch.tensor(continuation_enc).unsqueeze(-1)).squeeze(-1)
# Relevance score: mean log-probability of the query tokens.
score = torch.sum(logprobs) / logprobs.shape[0]

print(f"Document: {document[:20] + '...'} Score: {score}")
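
The snippet above scores a single query-document pair. To rank several candidate documents, one option is to compute the same score for each document and sort in descending order. The following is a minimal sketch under stated assumptions: `rank_documents` and `toy_score` are hypothetical names that are not part of the released code, and `toy_score` is a word-overlap stand-in used only so the example runs without downloading a model. In practice, `score_fn` would wrap the scoring code above and return `score.item()`.

```python
from typing import Callable, List, Tuple


def rank_documents(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Return (document, score) pairs sorted from most to least relevant."""
    scored = [(doc, score_fn(query, doc)) for doc in documents]
    # A higher score means a more relevant document, so sort in descending order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(query: str, document: str) -> float:
    """Stand-in scorer for illustration only: word overlap, NOT the model score."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / max(len(query_words), 1)


candidates = [
    'The capital of France is Paris.',
    'Most babies start to walk around 13 months.',
]
for doc, doc_score in rank_documents('when should a baby walk', candidates, toy_score):
    print(f"Document: {doc[:20] + '...'} Score: {doc_score}")
```

Passing the scorer as a callable keeps the sorting logic independent of the model, so the same helper can be reused with any of the RankingGPT checkpoints.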

Results

Model	DL19	DL20	BEIR	URL
MonoBERT-340M	72.3	70.3	50.5	huggingface
MonoT5-220M	71.5	69.7	49.3	huggingface
MonoT5-770M	73.2	71.2	53.1	huggingface
MonoT5-3B	72.8	74.5	54.6	huggingface
RankT5-770M	-	-	53.7	huggingface
RankLLaMA	74.6	76.6	52.5	huggingface
RankingGPT-bloom-560m	75.3	73.2	53.7	huggingface modelscope
RankingGPT-bloom-1b1	75.6	73.2	54.5	huggingface modelscope
RankingGPT-bloom-3b	76.8	73.6	56.2	huggingface modelscope
RankingGPT-bloom-7b	77.3	74.6	56.6	huggingface modelscope
RankingGPT-llama2-7b	76.2	76.3	57.8	huggingface modelscope
RankingGPT-baichuan2-7b	75.9	74.3	57.5	huggingface modelscope
RankingGPT-qwen-7b	75.8	74.3	58.3	huggingface modelscope

Citation

If you find our paper or models helpful, please consider citing them as follows:

@misc{zhang2023rankinggpt,
      title={RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement}, 
      author={Longhui Zhang and Yanzhao Zhang and Dingkun Long and Pengjun Xie and Meishan Zhang and Min Zhang},
      year={2023},
      eprint={2311.16720},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}