---
license: llama2
---

# RankLLaMA-7B-Document

[Fine-Tuning LLaMA for Multi-Stage Text Retrieval](TODO). Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin, arXiv 2023.

This model is fine-tuned from LLaMA-2-7B with LoRA for document reranking. It accepts inputs of up to 4096 tokens.

## Usage

Below is an example of computing the similarity score of a query-document pair; a higher score indicates a more relevant document. A sketch of reranking multiple candidate documents with the same scoring function appears at the end of this card.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel, PeftConfig

def get_model(peft_model_name):
    # Load the base model, apply the LoRA adapter, and merge the adapter
    # weights so the result behaves like a plain sequence-classification model.
    config = PeftConfig.from_pretrained(peft_model_name)
    base_model = AutoModelForSequenceClassification.from_pretrained(config.base_model_name_or_path)
    model = PeftModel.from_pretrained(base_model, peft_model_name)
    model = model.merge_and_unload()
    model.eval()
    return model

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('meta-llama/Llama-2-7b-hf')
model = get_model('castorini/rankllama-v1-7b-lora-doc')

# Define a query-document pair
query = "What is llama?"
url = "https://en.wikipedia.org/wiki/Llama"
title = "Llama"
document = "The llama is a domesticated South American camelid, widely used as a meat and pack animal by Andean cultures since the pre-Columbian era."

# Tokenize the query-document pair
inputs = tokenizer(f'query: {query}', f'document: {url} {title} {document}', return_tensors='pt')

# Run the model forward and read off the relevance score
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    score = logits[0][0]
    print(score)
```

## Citation

If you find our paper or models helpful, please consider citing as follows:

```
TODO
```
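## Reranking multiple documents

In practice, the scoring function above is typically applied to a list of candidate documents for a single query, and the candidates are sorted by score. The sketch below reuses `tokenizer` and `model` from the usage example; the `rerank` helper and the candidate list are hypothetical illustrations, not part of the released code. Note `truncation=True` with `max_length=4096`, which clips long documents to the model's input limit.

```python
import torch

def rerank(query, candidates, max_length=4096):
    # Hypothetical helper: score each (url, title, text) candidate against the
    # query and return (candidate, score) pairs sorted from most to least relevant.
    scores = []
    for url, title, text in candidates:
        inputs = tokenizer(
            f'query: {query}',
            f'document: {url} {title} {text}',
            truncation=True,        # clip inputs to the model's 4096-token limit
            max_length=max_length,
            return_tensors='pt',
        )
        with torch.no_grad():
            scores.append(model(**inputs).logits[0][0].item())
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [(candidates[i], scores[i]) for i in order]

# Illustrative candidates (values are examples only)
candidates = [
    ("https://en.wikipedia.org/wiki/Llama", "Llama",
     "The llama is a domesticated South American camelid."),
    ("https://en.wikipedia.org/wiki/Alpaca", "Alpaca",
     "The alpaca is a species of South American camelid mammal."),
]

for (url, title, _), score in rerank("What is llama?", candidates):
    print(f"{score:.3f}\t{title}\t{url}")
```

For throughput, the pairs can instead be tokenized together with padding and scored as a single batch, and the model moved to a GPU before scoring.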