RaphaelMourad committed
Commit fb0fa23 · 1 Parent(s): 0a11a28
Update README.md
README.md CHANGED
@@ -8,9 +8,9 @@ tags:
 - genomics
 ---
 
-# Model Card for
+# Model Card for Mistral-DNA-v1-138M-bacteria (mistral for DNA)
 
-The
+The Mistral-DNA-v1-138M-bacteria Large Language Model (LLM) is a pretrained generative DNA text model with 17.31M parameters x 8 experts = 138.5M parameters.
 It is derived from Mistral-7B-v0.1 model, which was simplified for DNA: the number of layers and the hidden size were reduced.
 The model was pretrained using around 700 bacterial genomes with 10kb DNA sequences.
 
@@ -29,8 +29,8 @@ Like Mistral-7B-v0.1, it is a transformer model, with the following architecture
 import torch
 from transformers import AutoTokenizer, AutoModel
 
-tokenizer = AutoTokenizer.from_pretrained("RaphaelMourad/
-model = AutoModel.from_pretrained("RaphaelMourad/
+tokenizer = AutoTokenizer.from_pretrained("RaphaelMourad/Mistral-DNA-v1-138M-bacteria", trust_remote_code=True) # Same as DNABERT2
+model = AutoModel.from_pretrained("RaphaelMourad/Mistral-DNA-v1-138M-bacteria", trust_remote_code=True)
 ```
 
 ## Calculate the embedding of a DNA sequence
 
@@ -51,7 +51,7 @@ Ensure you are utilizing a stable version of Transformers, 4.34.0 or newer.
 
 ## Notice
 
-Mistral-DNA is a pretrained base model for DNA.
+Mistral-DNA-v1-138M-bacteria is a pretrained base model for DNA.
 
 ## Contact
 
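
The updated model card states the size as "17.31M parameters x 8 experts = 138.5M parameters". The arithmetic can be checked with a minimal sketch. It assumes the total is simply the per-expert count multiplied by the number of experts, exactly as the card writes it; weights shared across experts are not modelled, and the constant names are illustrative rather than taken from the repository.

```python
# Sanity check of the parameter count quoted in the model card.
# Assumption: total = per-expert parameters * number of experts, as the card
# writes it; any weights shared across experts are ignored here.
PARAMS_PER_EXPERT_M = 17.31  # millions of parameters, from the model card
NUM_EXPERTS = 8              # from the model card

total_m = PARAMS_PER_EXPERT_M * NUM_EXPERTS

# Show the unrounded product next to the one-decimal figure used in the card.
print(f"{total_m:.2f}M total, reported as {total_m:.1f}M")
```

The product is 138.48M, which rounds to the 138.5M figure in the card and matches the "138M" in the new model name.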