This is a version of the paraphrase detector by DeepPavlov ([details in the documentation](http://docs.deeppavlov.ai/en/master/features/overview.html#ranking-model-docs)) ported to the `Transformers` format.
All credit goes to the authors of DeepPavlov.

The model has been trained on the dataset from http://paraphraser.ru/.

It classifies texts as paraphrases (class 1) or non-paraphrases (class 0).

```python
import torch
from transformers import AutoModelForSequenceClassification, BertTokenizer

model_name = 'cointegrated/rubert-base-cased-dp-paraphrase-detection'
# .cuda() requires a CUDA-capable GPU; drop it to run on CPU
model = AutoModelForSequenceClassification.from_pretrained(model_name).cuda()
tokenizer = BertTokenizer.from_pretrained(model_name)

text1 = 'Сегодня на улице хорошая погода'  # "The weather outside is nice today"
text2 = 'Сегодня на улице отвратительная погода'  # "The weather outside is awful today"

# Encode the two texts as a single sentence pair and move the tensors to the model's device
batch = tokenizer(text1, text2, return_tensors='pt').to(model.device)
with torch.inference_mode():
    # Softmax over the two logits gives [P(class 0), P(class 1)]
    proba = torch.softmax(model(**batch).logits, -1).cpu().numpy()
print(proba)
# [[0.44876656 0.5512334 ]]
```
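
To turn a probability row into a paraphrase / non-paraphrase decision, one option is to threshold the class 1 probability. The sketch below is a minimal illustration in plain Python, not part of the original model or of DeepPavlov: the helper names `softmax` and `is_paraphrase` and the default threshold of 0.5 are assumptions made for this example, and a suitable threshold would need to be tuned on your own data.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def is_paraphrase(proba_row, threshold=0.5):
    """Return True if the class 1 probability reaches the threshold.

    `threshold` is a hypothetical default, not a value recommended
    by the model authors.
    """
    return proba_row[1] >= threshold

# Probabilities printed by the example above for the two weather sentences
row = [0.44876656, 0.5512334]
print(is_paraphrase(row))       # True
print(is_paraphrase(row, 0.7))  # False
```

With the default threshold the two sentences are labelled as paraphrases even though they have opposite meanings, so a stricter threshold may be worth considering when false positives are costly.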