navteca/quora-roberta-large

	---
	datasets:
	- quora
	language: en
	license: mit
	pipeline_tag: text-classification
	tags:
	- roberta
	- text-classification
	---
	# Cross-Encoder for Quora Duplicate Questions Detection

	This model was trained using the [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.

	It uses [roberta-large](https://huggingface.co/roberta-large) as its base model.

	## Training Data

	This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset.

	The model predicts a score between 0 and 1 that indicates how likely the two given questions are to be duplicates.

	Note: The model is not suitable for estimating the similarity of questions. For example, the two questions "How to learn Java" and "How to learn Python" will receive a rather low score, because they are not duplicates.

	## Usage and Performance

	The trained model can be used like this:

	```python
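	from sentence_transformers import CrossEncoder

	# Load the cross-encoder; the model ID is taken from this repository's name.
	model = CrossEncoder('navteca/quora-roberta-large')

	# Each input is a pair of questions; one score is returned per pair.
	scores = model.predict([('Question 1', 'Question 2'), ('Question 3', 'Question 4')])

	print(scores)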
	```
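
The scores returned by `predict` can be turned into duplicate / not-duplicate decisions by comparing them with a threshold. The following is a minimal sketch of that step. The helper `label_duplicates` is hypothetical, and the cut-off of 0.5 is an assumption rather than a value recommended by this model card. The scores are placeholders chosen for illustration, not actual model outputs.

```python
from typing import List, Sequence, Tuple


def label_duplicates(
    pairs: Sequence[Tuple[str, str]],
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; tune it on held-out data
) -> List[Tuple[str, str, float, bool]]:
    """Attach a duplicate / not-duplicate decision to each scored pair."""
    if len(pairs) != len(scores):
        raise ValueError("pairs and scores must have the same length")
    return [
        (q1, q2, float(score), float(score) >= threshold)
        for (q1, q2), score in zip(pairs, scores)
    ]


pairs = [
    ("How to learn Java", "How to learn Python"),
    ("How do I learn Java?", "What is the best way to learn Java?"),
]

# Placeholder scores for illustration only; in practice use model.predict(pairs).
example_scores = [0.02, 0.91]

labeled = label_duplicates(pairs, example_scores)
for q1, q2, score, is_duplicate in labeled:
    print(f"{score:.2f}  duplicate={is_duplicate}  {q1!r} / {q2!r}")
```

A higher threshold trades recall for precision, so the right value depends on how costly a false duplicate is in the target application.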