Dan Ofer

GrimSqueaker

AI & ML interests

Bioinformatics, Neurobiology, AutoML, Feature engineering, Proteins, NLP

Recent Activity

liked a dataset 3 days ago
ctheodoris/Genecorpus-30M
liked a model 3 days ago
ctheodoris/Geneformer
liked a model 5 days ago
chandar-lab/AMPLIFY_350M

GrimSqueaker's activity

I'd just start with ModernBERT large though; it's an easier, strong base. Less faffing about. Also, big vocab <3

They do PCA (prior to the Zipf weighting) and explicitly state that it improved performance.
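
For anyone skimming, here's a rough sketch of that ordering (PCA first, Zipf-style weighting second). It's only an illustration under assumptions: the comment doesn't say who "they" are, so I'm guessing it's a Model2Vec-style static-embedding setup. The function names are made up, rows are assumed to be sorted from most to least frequent token, and the `log(1 + rank)` weight is one common Zipf-style choice, not necessarily the exact one they use.

```python
import numpy as np


def pca_reduce(embeddings: np.ndarray, n_components: int) -> np.ndarray:
    """Project row vectors onto their top principal components (plain SVD-based PCA)."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def zipf_weight(embeddings: np.ndarray) -> np.ndarray:
    """Scale row i by log(1 + rank), rank starting at 1.

    ASSUMPTION: row order is a proxy for frequency rank (most frequent first),
    so rarer tokens get larger weights. Hypothetical helper, not a library API.
    """
    ranks = np.arange(1, embeddings.shape[0] + 1)
    return embeddings * np.log(1 + ranks)[:, None]


def pca_then_zipf(embeddings: np.ndarray, n_components: int) -> np.ndarray:
    """Apply PCA first, then the Zipf-style weighting, as the comment describes."""
    return zipf_weight(pca_reduce(embeddings, n_components))


# Toy stand-in for per-token embeddings (random data, purely illustrative).
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(1000, 64))
static_embeddings = pca_then_zipf(token_embeddings, n_components=16)
print(static_embeddings.shape)
```

The order matters because PCA centres and rotates the vectors; if the rows were rescaled first, the principal directions would be dominated by the heavily weighted rare tokens.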

Did you try potion/m2v as a starting point? (nvm ModernBERT and its much larger vocab)