Embedding from Eos token

#1
by manu - opened

Hey, cool work!
Since it's a causal decoder model, wouldn't it make sense to train and use the representation from the last token only (rather than the mean of all tokens)? Unlike in BERT-like models, the first tokens hold no information about the rest of the sequence!
Cheers,
Manu
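The idea above can be sketched as follows. This is a minimal, hypothetical illustration (plain Python lists standing in for the model's hidden-state tensors, and an invented helper name), not the model's actual pooling code: for a causal decoder, only the last token attends to the full sequence, and with right-padded batches the "last token" is the last position where the attention mask is 1, not simply index -1.

```python
def last_token_pooling(token_embeddings, attention_mask):
    """Pick the hidden state of the last non-padding token per sequence.

    token_embeddings: batch of per-token vectors, shape [batch][seq][dim]
    attention_mask:   1 for real tokens, 0 for padding, shape [batch][seq]
    """
    pooled = []
    for embs, mask in zip(token_embeddings, attention_mask):
        # Last position where the mask is 1 (last real token).
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(embs[last])
    return pooled


# Toy batch: two sequences, the second one right-padded by one token.
embeddings = [
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    [[4.0, 4.0], [5.0, 5.0], [0.0, 0.0]],  # last position is padding
]
mask = [[1, 1, 1], [1, 1, 0]]

print(last_token_pooling(embeddings, mask))  # → [[3.0, 3.0], [5.0, 5.0]]
```

In a real model the same gather is done on tensors, but the mask logic is the key detail: naively taking index -1 would return a padding token's embedding for any right-padded sequence.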

Owner

Hey,
Thanks for the suggestion, trying this today!
Wissam

Cool! Let me know if you have any results!

Hi, I got surprisingly low results using `pooling_mode_lasttoken`; I'll investigate further to fix the issue, or fall back to a custom implementation otherwise. Will keep you posted!
Cheers

Thanks!
