Using the pre-trained model (monot5-base-msmarco-10k) to get inference via CrossEncoders

#1
by EltonLobo - opened

I am trying to use monot5-base-msmarco-10k to generate relevance scores by invoking the model through CrossEncoder.

This is the full code:
##code##
from sentence_transformers.cross_encoder import CrossEncoder

model_path = "./monot5-base-msmarco-10k"  # local copy of monot5-base-msmarco-10k
model = CrossEncoder(model_path)

question = "Hello"
context = "What is your name?"
print(model.predict([[question, context]]))

##output and warning##
Some weights of T5ForSequenceClassification were not initialized from the model checkpoint at ./monot5-base-msmarco-10k and are newly initialized: ['classification_head.dense.bias', 'classification_head.dense.weight', 'classification_head.out_proj.bias', 'classification_head.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[0.3089498]

With model.predict, every input pair I try gives nearly the same score, regardless of whether the texts are similar or dissimilar.
My understanding is that this checkpoint is already trained for relevance ranking on MS MARCO and should produce clearly different scores for relevant and irrelevant pairs, but that is not what I observe. Should I fine-tune the model, and if so, how?
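For context, my current reading of the warning is that CrossEncoder wraps the checkpoint in T5ForSequenceClassification and attaches a randomly initialized classification head, which would explain the near-constant scores. Below is a sketch of how I understand MonoT5 is normally scored as a sequence-to-sequence model; the "Query: ... Document: ... Relevant:" prompt and the true/false token scoring are assumptions taken from the MonoT5 paper, not something I have confirmed for this exact checkpoint.

##seq2seq scoring sketch##
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_path = "./monot5-base-msmarco-10k"
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path)
model.eval()

query = "What is your name?"
document = "Hello"

# Assumed MonoT5 prompt format; the model is trained to generate
# "true" or "false" as the first decoded token.
prompt = f"Query: {query} Document: {document} Relevant:"
inputs = tokenizer(prompt, return_tensors="pt")

true_id = tokenizer.encode("true", add_special_tokens=False)[0]
false_id = tokenizer.encode("false", add_special_tokens=False)[0]

with torch.no_grad():
    # Score only the first decoder step, starting from the decoder start token.
    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits[0, 0]

# Relevance score = probability of "true" vs "false" at the first step.
probs = torch.softmax(logits[[true_id, false_id]], dim=0)
print(probs[0].item())

If this is the intended way to use the model, I would expect the CrossEncoder route to require fine-tuning the new classification head first, but I would appreciate confirmation.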
