Is the CLS token included in pre-training?

#3
by pipparichter - opened

Hello! I am using the pooler_output of EsmModel for a classification task, training my own classification head on top of it. I have been getting the following warning:

Some weights of EsmModel were not initialized from the model checkpoint at facebook/esm2_t33_650M_UR50D and are newly initialized: ['esm.pooler.dense.weight', 'esm.pooler.dense.bias']

I am aware that some models do not include the CLS token in pre-training, so additional fine-tuning is required before it produces meaningful embeddings. I just wanted to confirm that this is not the case here. The description of the pooler_output attribute in the docs seems to indicate that it is not.
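For context, a minimal sketch of two ways to get a sequence embedding without going through the pooler. The warning above reports that the pooler's dense layer is newly initialized, and as far as I understand the Hugging Face implementation, pooler_output is that dense layer plus a tanh applied to the first token's hidden state. The helper names `cls_embedding` and `mean_pool` are hypothetical, and the random tensors below are only stand-ins for the `last_hidden_state` and `attention_mask` that a real EsmModel forward pass would provide.

```python
import torch


def cls_embedding(last_hidden_state: torch.Tensor) -> torch.Tensor:
    """Hidden state of the first (CLS) token, shape (batch, hidden)."""
    return last_hidden_state[:, 0]


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average the token hidden states, ignoring padded positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1.0)  # guard against all-padding rows
    return summed / counts


# Stand-in data (assumption): batch of 2 sequences, 5 tokens, hidden size 8.
torch.manual_seed(0)
hidden = torch.randn(2, 5, 8)
mask = torch.tensor([[1, 1, 1, 1, 1],
                     [1, 1, 1, 0, 0]])  # second sequence has 2 padding tokens

cls_vec = cls_embedding(hidden)
mean_vec = mean_pool(hidden, mask)
print(cls_vec.shape, mean_vec.shape)
```

Either vector could feed a custom classification head in place of pooler_output. I believe passing add_pooling_layer=False to EsmModel.from_pretrained skips creating the pooler, which should also make the warning go away, but I have not verified this against this checkpoint.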
