Compatibility with Sentence Transformers

#1
by tomaarsen HF staff - opened

Hello!

I think this is very interesting work, it reminds me a bit of the negation dataset from Jina.

Sentence Transformers doesn't automatically integrate with PEFT yet, but I'm looking into that! In the meantime, this script should also work:

from sentence_transformers import SentenceTransformer
from peft import PeftModel

model = SentenceTransformer("all-mpnet-base-v2")
model[0].auto_model = PeftModel.from_pretrained(model[0].auto_model, "vahidthegreat/StanceAware-SBERT")

sentences = ["I love pineapple on pizza", "I hate pineapple on pizza"]
embeddings = model.encode(sentences)
print(embeddings.shape)

similarity = model.similarity(embeddings[0], embeddings[1])
print(similarity)

However, I'm seeing identical results whether or not the PEFT adapter is applied, including with your original scripts in the README. Could you look into this?
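
One way to confirm that observation numerically is to compare the embeddings produced before and after the adapter is attached. The snippet below is only a minimal sketch: `adapter_changed_embeddings` is a hypothetical helper (not part of Sentence Transformers or PEFT), the tolerance is an arbitrary assumption, and the toy arrays stand in for real `model.encode(...)` outputs.

```python
import numpy as np


def adapter_changed_embeddings(base_emb, adapted_emb, atol=1e-6):
    """Return (changed, max_abs_diff) for two embedding arrays of equal shape.

    Hypothetical helper: `changed` is True when any coordinate differs by
    more than `atol` (an assumed tolerance, not a library default).
    """
    base_emb = np.asarray(base_emb, dtype=np.float64)
    adapted_emb = np.asarray(adapted_emb, dtype=np.float64)
    if base_emb.shape != adapted_emb.shape:
        raise ValueError("embedding arrays must have the same shape")
    max_abs_diff = float(np.max(np.abs(base_emb - adapted_emb)))
    return max_abs_diff > atol, max_abs_diff


# Toy stand-ins for the outputs of model.encode(sentences)
base = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
same = base.copy()        # what an adapter with no effect would produce
shifted = base + 0.05     # what an adapter with a real effect might produce

print(adapter_changed_embeddings(base, same))
print(adapter_changed_embeddings(base, shifted))
```

With the script above, this would mean calling `model.encode(sentences)` once before replacing `model[0].auto_model` and once after, then passing both results to the helper. If `changed` comes back False, the adapter is not affecting the forward pass.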

  • Tom Aarsen

Hi. Thanks for letting me know.
I just finished correcting the model card. Now it should work.
The issue you mentioned also happened to me until I realized that I had fine-tuned the model after wrapping the base model in the "SiameseNetworkMPNet" class. This means that the normalization layers I added on top of the base model were part of the fine-tuned model as well. So, for the model to run correctly, you need to load it as follows: first wrap the base model in the "SiameseNetworkMPNet" class, and then apply the LoRA weights. Let me know if it works now.

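import torch
import torch.nn as nn
import torch.nn.functional as F
from peft import PeftModel
from transformers import AutoModel, AutoTokenizer


class SiameseNetworkMPNet(nn.Module):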
    def __init__(self, model_name, tokenizer, normalize=True):
        super(SiameseNetworkMPNet, self).__init__()

        self.model = AutoModel.from_pretrained(model_name)#, quantization_config=bnb_config, trust_remote_code=True)
        self.normalize = normalize
        self.tokenizer = tokenizer

    def forward(self, **inputs):
        model_output = self.model(**inputs)
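        # Cast the mask to the hidden-state dtype so padding tokens contribute zero
        attention_mask = inputs['attention_mask'].unsqueeze(-1).to(model_output.last_hidden_state.dtype)
        last_hidden_states = model_output.last_hidden_state  # token embeddings, shape (batch, seq_len, hidden)
        # Mean pooling: sum the unmasked token embeddings and divide by the number of real tokens
        embeddings = torch.sum(last_hidden_states * attention_mask, 1) / torch.clamp(attention_mask.sum(1), min=1e-9)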
        if self.normalize:
            embeddings = F.layer_norm(embeddings, embeddings.shape[1:])
            embeddings = F.normalize(embeddings, p=2, dim=1)

        return embeddings


base_model_name = "sentence-transformers/all-mpnet-base-v2" 
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Load the base model
base_model = SiameseNetworkMPNet(model_name=base_model_name, tokenizer=tokenizer)

# Load and apply LoRA weights
lora_model = SiameseNetworkMPNet(model_name=base_model_name, tokenizer=tokenizer)
lora_model = PeftModel.from_pretrained(lora_model, "vahidthegreat/StanceAware-SBERT")
lora_model = lora_model.merge_and_unload()

base_model.eval()
lora_model.eval()
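
To make the pooling and normalization steps in `forward` easier to follow, here is a minimal sketch of the same math on toy data, restated in NumPy so it runs without downloading a model. The function name `pool_and_normalize` and the toy inputs are illustrative assumptions; the epsilon values are chosen to mirror the PyTorch defaults for `F.layer_norm` (1e-5) and `F.normalize` (1e-12).

```python
import numpy as np


def pool_and_normalize(token_embeddings, attention_mask, eps=1e-5):
    """Masked mean pooling, parameter-free layer norm, then L2 normalization.

    token_embeddings: (batch, seq_len, hidden); attention_mask: (batch, seq_len).
    Returns (pooled, final) so the intermediate result can be inspected.
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    # Mean pooling over real (unmasked) tokens only
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    pooled = summed / counts
    # Layer norm over the hidden dimension, without learnable weight or bias
    mean = pooled.mean(axis=-1, keepdims=True)
    var = pooled.var(axis=-1, keepdims=True)  # biased variance, as in layer norm
    normed = (pooled - mean) / np.sqrt(var + eps)
    # Scale each embedding to unit L2 length
    norms = np.clip(np.linalg.norm(normed, axis=-1, keepdims=True), 1e-12, None)
    return pooled, normed / norms


# One sentence with three token positions; the last one is padding
tokens = np.array([[[1.0, 2.0, 3.0, 4.0],
                    [3.0, 4.0, 5.0, 6.0],
                    [100.0, 100.0, 100.0, 100.0]]])
mask = np.array([[1, 1, 0]])

pooled, final = pool_and_normalize(tokens, mask)
print(pooled)                  # average of the two real tokens
print(np.linalg.norm(final))   # close to 1 after L2 normalization
```

The padded token is deliberately given large values to show that the mask keeps it out of the average. Because the layer norm centers each embedding before the L2 step, the final vector has zero mean as well as unit length.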
