fix: config.json
Updated max_position_embeddings to match the max model input length.
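For reference, a quick way to check the value the Hub config reports (a minimal sketch using transformers, assuming the config.json in this repo is the one AutoConfig loads):

```python
from transformers import AutoConfig

# Load just the config from the Hub and inspect the field this PR changes
cfg = AutoConfig.from_pretrained("mixedbread-ai/deepset-mxbai-embed-de-large-v1")
print(cfg.max_position_embeddings)  # per the PR description, the max model input length
```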
Thanks @ouz-m
@ouz-m Have you tested it post-merge / with this PR?
@michaelfeil I now had a chance to test it, and it works!
@juliuslipp @ouz-m @michaelfeil
I have my concerns that this does not work with Torch: https://github.com/UKPLab/sentence-transformers/issues/2873
How to reproduce:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("mixedbread-ai/deepset-mxbai-embed-de-large-v1")
Very simply, the embeddings.position_embeddings.weight in model.safetensors has shape [514, 1024], which can't be loaded into a model with shape [512, 1024].
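A small sketch to confirm the stored shape without loading the full model (assuming safetensors and huggingface_hub are installed; the weight name is the one mentioned above):

```python
from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Download the checkpoint file and read only the tensor metadata
path = hf_hub_download("mixedbread-ai/deepset-mxbai-embed-de-large-v1", "model.safetensors")
with safe_open(path, framework="pt") as f:
    print(f.get_slice("embeddings.position_embeddings.weight").get_shape())
    # prints [514, 1024] per the report above, while the updated config implies [512, 1024]
```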
- Tom Aarsen
Hey @tomaarsen, you are right, the original XLM-RoBERTa was trained with 514 max position embeddings (see here). You can find the explanation here.
@michaelfeil I think the right fix would be to fix Optimum instead of changing the model config. I will look closer into that.
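For context, here is a small sketch of where the 514 comes from, assuming the usual RoBERTa position-id convention (pad token reserved, positions starting at padding_idx + 1):

```python
# (XLM-)RoBERTa offsets position ids by padding_idx + 1, so the position-embedding
# table needs max_input_length + padding_idx + 1 rows rather than max_input_length.
padding_idx = 1          # pad token id in the (XLM-)RoBERTa vocabulary
max_input_length = 512   # maximum number of input tokens
max_position_embeddings = max_input_length + padding_idx + 1
print(max_position_embeddings)  # 514
```

That is also why the checkpoint's [514, 1024] weight no longer fits once config.json says 512.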