
DNA Pre-trained BERT with WordPiece

This model performs poorly on DNA sequence prediction. It was pre-trained on reference genomes from multiple plant species using a WordPiece tokenizer. However, WordPiece is poorly suited to DNA: it was designed to split natural-language words into subwords, and DNA sequences have no word boundaries or subword structure for it to exploit, so the learned vocabulary fragments sequences inconsistently.
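For context, DNA language models typically avoid WordPiece in favor of overlapping k-mer tokenization (the approach popularized by DNABERT-style models). The sketch below is a minimal illustration of that alternative, not part of this repository's code; the function name and the choice of k=6 are assumptions for the example.

```python
def kmer_tokenize(seq, k=6):
    """Split a DNA sequence into overlapping k-mers (DNABERT-style).

    Unlike WordPiece, this needs no learned vocabulary merges and
    tokenizes every position of the sequence consistently.
    """
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# Example: a 10-base sequence yields 5 overlapping 6-mers.
print(kmer_tokenize("ATGCGTACGT", k=6))
# → ['ATGCGT', 'TGCGTA', 'GCGTAC', 'CGTACG', 'GTACGT']
```

Because every k-mer covers exactly k bases, the token stream preserves positional structure that a frequency-driven WordPiece vocabulary would split unpredictably.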

Because the model lacks predictive power, this repository will be removed.

Model size: 56.2M params (Safetensors, F32)

Model repository: suke-sho/BERT-plant-genome-6