Error
I get this error when loading the model
No sentence-transformers model found with name /root/.cache/torch/sentence_transformers/T-Systems-onsite_cross-en-de-roberta-sentence-transformer. Creating a new one with MEAN pooling.
It is not an error but a warning:
WARNING:sentence_transformers.SentenceTransformer:No sentence-transformers model found with name /root/.cache/torch/sentence_transformers/T-Systems-onsite_cross-en-de-roberta-sentence-transformer. Creating a new one with MEAN pooling.
The example given on the model card works.
It can be ignored.
Thanks for the immediate response, but when I create a vector store I get an error, which is something I didn't face with other sentence transformers. I will paste the error below:
from langchain.vectorstores import FAISS
vectorstore_1=FAISS.from_documents(chunked_documents, embeddings_1)
--> 602 embeddings = embedding.embed_documents(texts)
603 return cls.__from(
604 texts,
605 embeddings,
(...)
609 **kwargs,
610 )
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1614, in Module.__getattr__(self, name)
   1612 if name in modules:
   1613     return modules[name]
-> 1614 raise AttributeError("'{}' object has no attribute '{}'".format(
   1615     type(self).__name__, name))
AttributeError: 'SentenceTransformer' object has no attribute 'embed_documents'
Can you please provide a full minimal code example?
And double check if it works with this: deutsche-telekom/gbert-large-paraphrase-cosine
So I chunk my documents using a character text splitter and, through FAISS GPU, I pass the embeddings from sentence-transformers together with the chunked documents into the vector store. Unfortunately, while this works well with all other sentence transformers, it fails here.
I had used all-mpnet-base-v2 and all-MiniLM-L6-v2 and they work fine. Since my use case is German, I wanted to try your model.
from sentence_transformers import SentenceTransformer
embeddings_1 = SentenceTransformer('T-Systems-onsite/cross-en-de-roberta-sentence-transformer')
from langchain.vectorstores import FAISS
vectorstore_1=FAISS.from_documents(chunked_documents, embeddings_1)
vectorstore_1
This is how I did it for the other models:
from langchain.embeddings import HuggingFaceEmbeddings
embeddings_1 = HuggingFaceEmbeddings(model_name='sentence-transformers/all-mpnet-base-v2',model_kwargs={'device': 'cuda'})
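That difference appears to explain the traceback: `FAISS.from_documents` calls `embed_documents` on the object you pass in, which `HuggingFaceEmbeddings` implements but a raw `SentenceTransformer` does not (it only has `encode`). A minimal, dependency-free sketch of the interface LangChain expects (the class names and placeholder vectors here are illustrative, not LangChain's actual implementation):

```python
class FakeSentenceTransformer:
    """Stand-in for sentence_transformers.SentenceTransformer: exposes only encode()."""
    def encode(self, texts):
        # Placeholder vectors; a real model returns dense embeddings.
        return [[float(len(t)), 0.0] for t in texts]

class EmbeddingsAdapter:
    """Wraps a model that has .encode() behind the embed_documents/embed_query
    interface that FAISS.from_documents relies on (HuggingFaceEmbeddings does
    this for you)."""
    def __init__(self, model):
        self.model = model
    def embed_documents(self, texts):
        return self.model.encode(texts)
    def embed_query(self, text):
        return self.model.encode([text])[0]

model = FakeSentenceTransformer()
try:
    model.embed_documents(["hello"])  # what FAISS.from_documents calls
except AttributeError as e:
    print("raw model fails:", e)

adapter = EmbeddingsAdapter(model)
print(adapter.embed_documents(["hello", "welt"]))  # [[5.0, 0.0], [4.0, 0.0]]
```

So passing the cross-en-de model name to `HuggingFaceEmbeddings` (as in the snippet above for all-mpnet-base-v2) should work, since the wrapper provides `embed_documents`.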
For deutsche-telekom/gbert-large-paraphrase-cosine, it works fine.
But it does not seem to be the most commonly recommended model.