IndoBERT Cross-Encoder

This is a Cross-Encoder model for Indonesian (ID) that can be used for passage re-ranking. It was trained on mMARCO, the multilingual version of the MS MARCO Passage Ranking task.

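The model can be used for information retrieval: a first-stage retriever such as BM25 fetches candidate passages, and the Cross-Encoder re-ranks them. See SBERT.net Retrieve & Re-rank.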

Usage with SentenceTransformers

When you have SentenceTransformers installed, you can use the model like this:

from sentence_transformers import CrossEncoder

# 'model_name' is a placeholder: replace it with this model's Hub ID or a local path.
model = CrossEncoder('model_name', max_length=512)
query = 'How many people live in Berlin?'
docs = ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', 'New York City is famous for the Metropolitan Museum of Art.']

# Score every (query, passage) pair; a higher score means a more relevant passage.
pairs = [(query, doc) for doc in docs]
scores = model.predict(pairs)
print(scores)
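
To turn the scores into a ranking, sort the passages by score in descending order. The following is a minimal sketch, assuming that a higher score means a more relevant passage. `rerank`, `score_fn`, and `top_k` are hypothetical names introduced here for illustration and are not part of SentenceTransformers. Any callable that maps a list of (query, passage) pairs to one score per pair can be passed as `score_fn`, for example `model.predict`.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def rerank(
    query: str,
    docs: Sequence[str],
    score_fn: Callable[[List[Tuple[str, str]]], Sequence[float]],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Return (passage, score) tuples sorted from most to least relevant.

    Hypothetical helper: score_fn must return one score per input pair,
    in the same order as the pairs it receives.
    """
    pairs = [(query, doc) for doc in docs]
    # Convert to plain floats so NumPy scalars and Python floats behave the same.
    scores = [float(s) for s in score_fn(pairs)]
    ranked = sorted(zip(docs, scores), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Usage with the model loaded above (assumed to be in scope):
# ranked = rerank(query, docs, model.predict, top_k=10)
```

Keeping the scorer as a parameter makes the helper easy to try out with a stand-in function before the model is downloaded.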

Usage with Transformers

With the transformers library, you can use the model like this:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

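# 'model_name' is a placeholder: replace it with this model's Hub ID or a local path.
model = AutoModelForSequenceClassification.from_pretrained('model_name')
tokenizer = AutoTokenizer.from_pretrained('model_name')

# The first list holds the queries and the second the passages; entries are paired by position.
features = tokenizer(['How many people live in Berlin?', 'How many people live in Berlin?'], ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', 'New York City is famous for the Metropolitan Museum of Art.'], padding=True, truncation=True, return_tensors="pt")

# Inference only: switch to eval mode and disable gradient tracking.
model.eval()
with torch.no_grad():
    scores = model(**features).logits  # raw logits, one row per (query, passage) pair
    print(scores)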

Performance

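| Model | mMARCO Dev MRR@10 | mMARCO Dev R@1000 | Mr. TyDi Test MRR@10 | Mr. TyDi Test R@1000 | MIRACL Test nDCG@10 | MIRACL Test R@1000 |
|---|---|---|---|---|---|---|
| $\text{BM25 (Elasticsearch)}$ | .114 | .642 | .279 | .858 | .391 | .971 |
| $\text{IndoBERT}_{\text{CAT}}$ | .181 | .642 | .447 | .858 | .455 | .971 |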
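
For readers unfamiliar with the metrics, MRR@10 averages the reciprocal rank of the first relevant passage within the top 10 results (0 if none appears), and R@1000 is the fraction of relevant passages found within the top 1000. The following is a minimal sketch of both, not the evaluation script behind the table. The function names, the dictionary layout, and the toy IDs are assumptions made for illustration.

```python
from typing import Dict, List, Set


def mrr_at_k(rankings: Dict[str, List[str]], relevant: Dict[str, Set[str]], k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant passage within the top k."""
    total = 0.0
    for qid, ranked_ids in rankings.items():
        gold = relevant.get(qid, set())
        for rank, pid in enumerate(ranked_ids[:k], start=1):
            if pid in gold:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings) if rankings else 0.0


def recall_at_k(rankings: Dict[str, List[str]], relevant: Dict[str, Set[str]], k: int = 1000) -> float:
    """Average fraction of relevant passages retrieved within the top k."""
    per_query = []
    for qid, ranked_ids in rankings.items():
        gold = relevant.get(qid, set())
        if not gold:
            continue  # skip queries with no judged relevant passage
        per_query.append(len(gold.intersection(ranked_ids[:k])) / len(gold))
    return sum(per_query) / len(per_query) if per_query else 0.0


# Toy data (hypothetical query and passage IDs).
rankings = {"q1": ["p3", "p1", "p2"], "q2": ["p9", "p8", "p7"]}
relevant = {"q1": {"p1"}, "q2": {"p5"}}

print(mrr_at_k(rankings, relevant))          # 0.25: (1/2 + 0) / 2
print(recall_at_k(rankings, relevant, k=2))  # 0.5: q1 found, q2 missed
```

Re-ranking only reorders a fixed candidate list, so recall measured at the full depth of that list cannot change. If the Cross-Encoder re-ranks the BM25 top 1000 (an assumption; the card does not state the setup), that would explain why the R@1000 columns are identical for both rows.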