Data Science for Social Impact org

This repository contains the trained model for our manuscript, which is currently under review at BMC Bioinformatics. The model, called simcse-dna, is based on the original implementation of SimCSE. We adapted the original model to DNA downstream tasks by training it on a small sample of k-mer tokens generated from the human reference genome. It can be used to generate sentence embeddings for DNA tasks.
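As a rough illustration of the k-mer tokens mentioned above, the sketch below splits a DNA sequence into overlapping k-mers and joins them into a space-separated "sentence" that a sentence-embedding model could take as input. The function names `kmer_tokenize` and `to_sentence`, the default `k=6`, and the stride of 1 are assumptions made for this example; they are not taken from the manuscript or the repository.

```python
def kmer_tokenize(sequence: str, k: int = 6, stride: int = 1) -> list[str]:
    """Split a DNA sequence into k-mers (hypothetical helper, not the authors' code).

    k and stride are illustrative defaults, not values confirmed by the source.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive integers")
    # Normalise case and surrounding whitespace so "acgt" and "ACGT" match.
    sequence = sequence.strip().upper()
    # Slide a window of width k over the sequence; sequences shorter
    # than k produce an empty list.
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def to_sentence(sequence: str, k: int = 6, stride: int = 1) -> str:
    """Join the k-mers with spaces so they read like words in a sentence."""
    return " ".join(kmer_tokenize(sequence, k=k, stride=stride))


if __name__ == "__main__":
    print(to_sentence("ACGTACGTAC", k=6))
```

The resulting string can then be passed to the tokenizer shipped with the model, provided the k-mer size matches the one used during training.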

mmokoatle changed pull request status to merged
