Using the pre-trained model (monot5-base-msmarco-10k) to get inference via CrossEncoders

by EltonLobo - opened

I am trying to use monoT5 (monot5-base-msmarco-10k) to generate relevance scores by loading the model with CrossEncoder.

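This is the full code:
from sentence_transformers.cross_encoder import CrossEncoder

model_path = "./monot5-base-msmarco-10k"  # local copy of the monoT5 checkpoint
model = CrossEncoder(model_path)

# Placeholder: `question` was not defined in the snippet as originally posted
question = "Who are you?"
context = "What is your name?"

# predict() takes a list of [text_a, text_b] pairs and returns one score per pair
print(model.predict([[question, context]]))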

##output and warning##
Some weights of T5ForSequenceClassification were not initialized from the model checkpoint at ./monot5-base-msmarco-10k and are newly initialized: ['classification_head.dense.bias', 'classification_head.dense.weight', 'classification_head.out_proj.bias', 'classification_head.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

With model.predict, every input pair gets a very similar score, regardless of how similar or dissimilar the two texts are.
My understanding is that the model is pre-trained on a corpus and should give distinct scores for similar and dissimilar inputs, but that is not what I see. Should I fine-tune the model, and if so, how?
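
A possible reading of the warning above: CrossEncoder loads the checkpoint as T5ForSequenceClassification, and the classification head weights listed in the warning are newly initialized, which would explain the near-identical scores. As far as I know, monoT5 checkpoints are sequence-to-sequence models that are scored by comparing the logits of the tokens "true" and "false" after a prompt of the form "Query: ... Document: ... Relevant:". Below is a minimal sketch of that scoring, not a confirmed fix. Assumptions: the prompt template and the `▁true` / `▁false` token names follow the usual monoT5 convention, the tokenizer comes from `t5-base` (the local checkpoint may not ship tokenizer files), and the helper names `relevance_probability` and `monot5_score` are made up for illustration.

```python
import math


def relevance_probability(true_logit: float, false_logit: float) -> float:
    """Two-way softmax over the 'true' and 'false' logits; returns P(true)."""
    m = max(true_logit, false_logit)  # subtract the max for numerical stability
    e_true = math.exp(true_logit - m)
    e_false = math.exp(false_logit - m)
    return e_true / (e_true + e_false)


def monot5_score(query: str, document: str,
                 model_path: str = "./monot5-base-msmarco-10k",
                 tokenizer_name: str = "t5-base") -> float:
    """Score one query/document pair with the seq2seq (language-model) head.

    Hypothetical helper: it reloads the model on every call for brevity,
    so load once and reuse if you score many pairs.
    """
    # Imported here so relevance_probability() works without these packages.
    import torch
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)  # assumption: t5-base vocab
    model = T5ForConditionalGeneration.from_pretrained(model_path).eval()

    # Assumed monoT5 prompt template
    prompt = f"Query: {query} Document: {document} Relevant:"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)

    # Run a single decoder step starting from the decoder start token
    decoder_input_ids = torch.full(
        (1, 1), model.config.decoder_start_token_id, dtype=torch.long
    )
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits[0, 0]

    true_id = tokenizer.convert_tokens_to_ids("▁true")
    false_id = tokenizer.convert_tokens_to_ids("▁false")
    return relevance_probability(logits[true_id].item(), logits[false_id].item())
```

Calling `monot5_score(question, context)` on a relevant and an irrelevant pair should show whether the scores separate without any fine-tuning; if they do, the CrossEncoder wrapper is the likely cause rather than the checkpoint.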
