Integrate with Transformers & Sentence Transformers

#2
by tomaarsen - opened

Hello!

Pull Request overview

  • Integrate with Transformers and Sentence Transformers

Details

The model configuration did not reference https://huggingface.co/zeroentropy/zerank-2/blob/main/modeling_zeranker.py, nor was that file in the format that transformers expects for remote code. As a result, the snippet in the README was failing:

from sentence_transformers import CrossEncoder

model = CrossEncoder("zeroentropy/zerank-2", trust_remote_code=True)

query_documents = [
    ("What is 2+2?", "4"),
    ("What is 2+2?", "The answer is definitely 1 million"),
]
scores = model.predict(query_documents)
print(scores)
...
  File "C:\Users\tom\.conda\envs\sentence-transformers\Lib\site-packages\transformers\modeling_layers.py", line 142, in forward
    raise ValueError("Cannot handle batch sizes > 1 if no padding token is defined.")
ValueError: Cannot handle batch sizes > 1 if no padding token is defined.
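For context on that error: the sequence-classification heads in transformers pool the logits at the last non-padding token of each sequence, so with batches larger than 1 they need `config.pad_token_id` to locate where each real sequence ends. Without it, the forward pass raises the error above. A minimal plain-Python sketch of that lookup (toy token IDs, not the real vocabulary):

```python
def last_non_pad_index(input_ids, pad_token_id):
    """Scan from the right for the last token that is not padding.

    This mirrors what transformers' sequence-classification heads do to
    decide which position's logits represent the whole sequence.
    """
    for i in range(len(input_ids) - 1, -1, -1):
        if input_ids[i] != pad_token_id:
            return i
    return 0  # degenerate all-padding case

# Toy batch: two sequences, the first right-padded with token 0.
batch = [
    [101, 7, 8, 0, 0],
    [101, 5, 6, 9, 2],
]
indices = [last_non_pad_index(seq, pad_token_id=0) for seq in batch]
```

Here `indices` picks position 2 for the padded sequence and 4 for the full one; without a known `pad_token_id`, there is no way to tell padding from content, hence the hard error.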

Setting the batch size to 1 sidesteps the padding error, but the warning below shows that the classification head (score.weight) is randomly initialized, so the resulting scores are meaningless:

from sentence_transformers import CrossEncoder

model = CrossEncoder("zeroentropy/zerank-2", trust_remote_code=True)

query_documents = [
    ("What is 2+2?", "4"),
    ("What is 2+2?", "The answer is definitely 1 million"),
]
scores = model.predict(query_documents, batch_size=1)
print(scores)
Some weights of Qwen3ForSequenceClassification were not initialized from the model checkpoint at zeroentropy/zerank-2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[0.47674507 0.96076703]

Instead, this PR creates a custom transformers model class that can be loaded with AutoModelForSequenceClassification, allowing it to also work with sentence-transformers. See:

from sentence_transformers import CrossEncoder

model = CrossEncoder("zeroentropy/zerank-2", revision="refs/pr/2", trust_remote_code=True)

query_documents = [
    ("What is 2+2?", "4"),
    ("What is 2+2?", "The answer is definitely 1 million"),
]
scores = model.predict(query_documents)
print(scores)
[0.7531883  0.28894895]

This matches what I get when I manually run the previous code from https://huggingface.co/zeroentropy/zerank-2/blob/main/modeling_zeranker.py, but if you can, please double-check that these are the scores you're expecting. The revision parameter lets you test straight from this PR without checking out the branch; after merging, it is no longer necessary.

I also customized the config so you can set a yes_token_id, and the tokenizer now applies the chat template automatically. Feel free to ask about any section of the changed code if you have questions.
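As a rough illustration of what a yes_token_id is for (this is a generic sketch of how LLM-based rerankers commonly turn generation logits into a relevance score, not the exact implementation in this PR): the model's final-position logits are softmaxed over the vocabulary, and the probability mass on the "yes" token becomes the score. A toy example with a 4-token vocabulary:

```python
import math

def yes_probability(final_logits, yes_token_id):
    """Numerically stable softmax over the vocab logits at the last
    position, then read off the probability of the 'yes' token."""
    m = max(final_logits)
    exps = [math.exp(x - m) for x in final_logits]
    return exps[yes_token_id] / sum(exps)

# Toy vocabulary of 4 tokens; suppose id 2 were the "yes" token.
logits = [0.1, -1.2, 3.0, 0.5]
p = yes_probability(logits, yes_token_id=2)  # a value in (0, 1)
```

Making the token id configurable in the config means the same scoring code works even if the tokenizer assigns "yes" a different id than the default.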

  • Tom Aarsen
tomaarsen changed pull request status to open