Gamma parameter breaks conversion to CTranslate2

#3
by michaelfeil - opened

When converting with CTranslate2, this BERT model seems to break because of a "GAMMA" parameter that appears to differ from other BERT implementations. From my side, this is very hard to troubleshoot. Any comment on that?

I have never used CTranslate2; however, this model is based on T5, not on BERT. Importantly, when loading this model with the transformers library, you need to explicitly use the T5 encoder class. Our newer models, e.g. jina-embeddings-v2-base-en, use a backbone that is closer to BERT but modified to support longer input texts. For those you have to make sure our specific implementation is used (with transformers this is done by setting trust_remote_code=True in the from_pretrained function). Otherwise it falls back to a stock BERT implementation (and raises a warning), and the model will just produce random embeddings. I hope this helps.
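A minimal sketch of the two loading patterns described above, assuming the standard transformers API; the model IDs in the usage note are illustrative and may need to be adjusted to the exact repository names:

```python
from transformers import AutoModel, AutoTokenizer, T5EncoderModel


def load_t5_embedding_model(model_id: str):
    """Load a T5-based embedding model with the encoder-only T5 class,
    not a BERT class."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5EncoderModel.from_pretrained(model_id)
    return tokenizer, model


def load_v2_embedding_model(model_id: str):
    """Load a v2-style model with its custom backbone.

    trust_remote_code=True makes transformers use the modeling code shipped
    in the model repository; without it, loading falls back to the stock
    BERT implementation (with a warning) and produces random embeddings."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
    return tokenizer, model
```

Usage would look like `load_t5_embedding_model("jinaai/jina-embedding-s-en-v1")` for the T5-based models and `load_v2_embedding_model("jinaai/jina-embeddings-v2-base-en")` for the newer ones.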