llmware/industry-bert-contracts-v0.1 · pooling method and results

First of all, thanks for the nice work and for sharing your model 🤗

I have a few questions about the "industry bert" family of models your team has published here:

Which pooling method should be used to obtain embeddings from your models? I saw here that you use the CLS token (i.e. the first embedding vector in the HF output), but I would like to be sure.
Do you normalize your embeddings?
Which distance metric should be used: L2 or dot product?
Have you validated your models on an industry retrieval benchmark? If so, would you be comfortable sharing the results?
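
To make the first three questions concrete, here is a minimal sketch of the pipeline I am assuming: take the first token vector (CLS), L2-normalize it, then compare with a dot product. The helper names and the toy numbers are mine, not taken from your code or from real model output. It also shows why the metric question depends on normalization: for unit-length vectors, squared L2 distance equals 2 - 2 * dot, so both metrics give the same ranking.

```python
import math


def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS] in BERT-style models)."""
    return list(token_embeddings[0])


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy per-token outputs for two texts (made-up numbers, one row per token).
doc_a = [[3.0, 4.0], [1.0, 0.0]]
doc_b = [[0.0, 2.0], [5.0, 5.0]]

a = l2_normalize(cls_pool(doc_a))
b = l2_normalize(cls_pool(doc_b))

cosine = dot(a, b)          # dot product of unit vectors is cosine similarity
dist = l2_distance(a, b)    # satisfies dist**2 == 2 - 2 * cosine for unit vectors
```

If the embeddings are not normalized, this equivalence no longer holds and the choice between L2 and dot product can change the ranking, which is why I am asking.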

Also, just FYI (you probably already know this): to make your model easy to load via the Sentence BERT framework, you could attach the configuration files that framework expects, as this model does. You get these files when you serialize the model through the Sentence BERT model class, and they would make your model seamlessly usable.
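
For reference, a rough sketch of how those files could be generated with the sentence-transformers package. The CLS pooling setting is my assumption pending your answer above, and the function name and output directory are hypothetical.

```python
def export_as_sentence_transformer(model_id, output_dir):
    """Wrap a Hugging Face checkpoint as a SentenceTransformer and save it."""
    # Imported lazily so the sketch can be read without the package installed.
    from sentence_transformers import SentenceTransformer, models

    # Load the underlying transformer weights and tokenizer.
    word_embedding = models.Transformer(model_id)

    # Assumption: CLS pooling. Switch the flags if mean pooling is intended.
    pooling = models.Pooling(
        word_embedding.get_word_embedding_dimension(),
        pooling_mode_cls_token=True,
        pooling_mode_mean_tokens=False,
    )

    model = SentenceTransformer(modules=[word_embedding, pooling])
    # Saving writes the module and pooling configuration next to the weights.
    model.save(output_dir)
    return model
```

Calling something like `export_as_sentence_transformer("llmware/industry-bert-contracts-v0.1", "industry-bert-contracts-st")` and uploading the resulting folder should be enough for others to load the model by name.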