Cross-Encoder for Quora Duplicate Questions Detection

This model was trained using SentenceTransformers Cross-Encoder class.

This model uses roberta-large.

Training Data

This model was trained on the Quora Duplicate Questions dataset.

The model will predict a score between 0 and 1: How likely the two given questions are duplicates.

Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions "How to learn Java" and "How to learn Python" will result in a rahter low score, as these are not duplicates.

Usage and Performance

The trained model can be used like this:

from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')
scores = model.predict([('Question 1', 'Question 2'), ('Question 3', 'Question 4')])

print(scores)
Downloads last month
18
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Dataset used to train navteca/quora-roberta-large