pooling method and results
#2 opened by basilevc
First of all, thanks for the nice work and for sharing your model.
I have a few questions about the "industry bert" family of models your team has published here:
- Which pooling method should be used to obtain embeddings from your models? I saw here that you use the CLS token (i.e., the first embedding vector in the HF output), but I would like to be sure.
- Do you normalize your embeddings?
- Which distance metric should be used: L2 or dot product?
- Have you validated your models on an industry retrieval benchmark? If so, would you be comfortable sharing it?
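To make the first three questions concrete, here is a minimal sketch of what I mean by CLS pooling, normalization, and the two metrics. It uses a random NumPy array as a stand-in for the model's last hidden state, and the helper names `cls_pool` and `l2_normalize` are my own, not anything from your repository. It only illustrates the question and is not a claim about how your models were trained.

```python
import numpy as np


def cls_pool(last_hidden_state):
    """Take the first token vector of each sequence.

    Input shape: (batch, seq_len, hidden). Output shape: (batch, hidden).
    """
    return last_hidden_state[:, 0, :]


def l2_normalize(embeddings, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against division by zero)."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.maximum(norms, eps)


# Random stand-in for a model output: 2 sequences, 5 tokens, hidden size 8.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(2, 5, 8))

emb = l2_normalize(cls_pool(hidden))

# Compare the two candidate metrics on the normalized pair.
dot = float(emb[0] @ emb[1])
l2_sq = float(np.sum((emb[0] - emb[1]) ** 2))

# For unit vectors, ||a - b||^2 = 2 - 2 * (a . b), so both metrics rank alike.
print(dot, l2_sq, 2 - 2 * dot)
```

If the embeddings are normalized, dot product, cosine similarity, and L2 distance all produce the same ranking, so the metric question mostly matters if they are not normalized.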
Also, just FYI (you probably already know this): to make your model easily loadable via the Sentence-BERT framework, you could attach configuration files, as this model does. You obtain these files when you serialize the model via the Sentence-BERT model class. It would make your model seamlessly usable.
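For reference, a rough sketch of how those files could be generated with the `sentence-transformers` package. The CLS pooling mode and the normalization step are assumptions on my side pending your answers above, and the function name `build_cls_sentence_transformer` and both of its arguments are placeholders I made up for illustration.

```python
def build_cls_sentence_transformer(model_id, output_dir):
    """Wrap a Hugging Face checkpoint as a SentenceTransformer and save it.

    model_id and output_dir are placeholders supplied by the caller.
    CLS pooling and the Normalize module are assumptions, not confirmed settings.
    """
    # Imported here so that defining the function does not require the package.
    from sentence_transformers import SentenceTransformer, models

    word_model = models.Transformer(model_id)
    pooling = models.Pooling(
        word_model.get_word_embedding_dimension(),
        pooling_mode="cls",  # assumption: first-token pooling
    )
    normalize = models.Normalize()  # assumption: unit-length outputs

    model = SentenceTransformer(modules=[word_model, pooling, normalize])
    # Saving writes the module and pooling configuration files next to the weights.
    model.save(output_dir)
    return model
```

Calling it with your model id and a local directory should produce the extra configuration files, which could then be uploaded next to the existing weights.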