---
license: mit
language:
- en
---

# RankingGPT-bloom-560m

RankingGPT is a text ranker based on large language models, with significant in-domain and out-of-domain effectiveness.
We provide RankingGPT in different sizes and types, including bloom-560m, bloom-1b1, bloom-3b, bloom-7b, llama2-7b, baichuan2-7b and qwen-7b.

For more details, please refer to our [paper](https://arxiv.org/abs/2311.16720) and [GitHub repository](https://github.com/Alibaba-NLP/RankingGPT).

## Usage

Code example

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('RankingGPT-bloom-560m')
model = AutoModelForCausalLM.from_pretrained('RankingGPT-bloom-560m').eval()

query = 'when should a baby walk'
document = 'Most babies start to walk around 13 months, but your baby may start walking as early as 9 or 10 months or as late as 15 or 16 months.'

# The document is the context; the query is the continuation to be scored.
context = f'Document: {document} Query:'
example = context + query  # full prompt, shown for reference only

context_enc = tokenizer.encode(context, add_special_tokens=False)
continuation_enc = tokenizer.encode(query, add_special_tokens=False)

# Drop the last query token: the logits at position i predict token i + 1.
model_input = torch.tensor(context_enc + continuation_enc[:-1])
continuation_len = len(continuation_enc)
input_len, = model_input.shape

with torch.no_grad():
    logprobs = torch.nn.functional.log_softmax(model(model_input.unsqueeze(dim=0))[0], dim=-1)[0]

# Keep only the positions that predict query tokens, then pick each token's log-probability.
logprobs = logprobs[input_len - continuation_len:]
logprobs = torch.gather(logprobs, 1, torch.tensor(continuation_enc).unsqueeze(-1)).squeeze(-1)

# Relevance score: mean log-probability of the query tokens given the document.
score = torch.sum(logprobs) / logprobs.shape[0]

print(f"Document: {document[:20] + '...'} Score: {score}")
```

### Citation

If you find our paper or models helpful, please consider citing them as follows:

```
@misc{zhang2023rankinggpt,
      title={RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement},
      author={Longhui Zhang and Yanzhao Zhang and Dingkun Long and Pengjun Xie and Meishan Zhang and Min Zhang},
      year={2023},
      eprint={2311.16720},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}
```
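
### Scoring sketch

The score printed by the usage example is the mean log-probability of the query tokens given the document prompt, so a higher (less negative) score suggests a more relevant document. The block below is a minimal, dependency-free sketch of that arithmetic and of how several documents could be ordered by it. The function names (`log_softmax`, `mean_query_logprob`, `rank_by_score`) and the toy logits are hypothetical illustrations, not part of RankingGPT or `transformers`; in practice the logit rows would come from the model as in the usage example.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def mean_query_logprob(logits, query_ids):
    """Average log-probability of the query tokens.

    logits[i] is assumed to be the vocabulary logit row that predicts query_ids[i].
    """
    if not query_ids or len(logits) != len(query_ids):
        raise ValueError("need one logit row per query token")
    total = sum(log_softmax(row)[tok] for row, tok in zip(logits, query_ids))
    return total / len(query_ids)

def rank_by_score(scores):
    """Return document names ordered from most to least relevant."""
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (assumption): a 3-token vocabulary and a 2-token query.
query_ids = [0, 1]
toy_logits = {
    "doc_a": [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]],  # favours the query tokens
    "doc_b": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],  # uniform over the vocabulary
}

scores = {name: mean_query_logprob(rows, query_ids) for name, rows in toy_logits.items()}
ranking = rank_by_score(scores)
print(ranking)
```

Averaging over the query length, rather than summing, keeps scores comparable across queries with different token counts.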