Cross-Encoder for MS Marco

This model was trained on the MS Marco Passage Ranking task.

The model can be used for Information Retrieval: given a query, encode the query with each candidate passage (e.g. retrieved with ElasticSearch). Then sort the passages by score in decreasing order. See our paper R2ANKER for more details.

Usage with Transformers

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("YCZhou/R2ANKER")
model = AutoModelForSequenceClassification.from_pretrained("YCZhou/R2ANKER")

# Pair the query with each candidate passage (the query is repeated once per passage).
query = 'How many people live in Berlin?'
passages = ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', 'New York City is famous for the Metropolitan Museum of Art.']
features = tokenizer([query] * len(passages), passages, padding=True, truncation=True, return_tensors="pt")

model.eval()  # switch off dropout for inference
with torch.no_grad():  # no gradients are needed when scoring
    scores = model(**features).logits  # one row per (query, passage) pair
    print(scores)
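
The sorting step described above can be sketched as follows. This is a minimal illustration, not part of the released code: `rank_passages` is a hypothetical helper, and the scores shown are made-up values. It assumes each (query, passage) pair yields a single relevance score, for example via something like `scores.squeeze(-1).tolist()` if the model returns one logit per pair.

```python
from typing import List, Sequence, Tuple


def rank_passages(passages: Sequence[str], scores: Sequence[float]) -> List[Tuple[str, float]]:
    """Pair each passage with its score and sort by score, highest first."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical scores, for illustration only (not real model output).
example_passages = [
    "New York City is famous for the Metropolitan Museum of Art.",
    "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
]
example_scores = [-4.2, 7.9]

for passage, score in rank_passages(example_passages, example_scores):
    print(f"{score:+.2f}  {passage}")
```

Keeping the ranking step separate from the model call makes it easy to re-rank any candidate list, whatever retriever produced it.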

Citation

@inproceedings{DBLP:conf/acl/Zhou0GTXLJJ23,
  author       = {Yucheng Zhou and
                  Tao Shen and
                  Xiubo Geng and
                  Chongyang Tao and
                  Can Xu and
                  Guodong Long and
                  Binxing Jiao and
                  Daxin Jiang},
  title        = {Towards Robust Ranker for Text Retrieval},
  booktitle    = {Findings of the Association for Computational Linguistics: {ACL} 2023,
                  Toronto, Canada, July 9-14, 2023},
  pages        = {5387--5401},
  publisher    = {Association for Computational Linguistics},
  year         = {2023},
  url          = {https://doi.org/10.18653/v1/2023.findings-acl.332},
  doi          = {10.18653/V1/2023.FINDINGS-ACL.332},
  timestamp    = {Sat, 30 Sep 2023 09:33:34 +0200},
  biburl       = {https://dblp.org/rec/conf/acl/Zhou0GTXLJJ23.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}