Update README.md
README.md
CHANGED
@@ -11,9 +11,9 @@ tags:
 - promoter-prediction
 - phage
 ---
-## ProkBERT-mini-phage Model
+## ProkBERT-mini-long-phage Model
 
-This finetuned model is specifically designed for promoter identification and is based on the [ProkBERT-mini model](https://huggingface.co/neuralbioinfo/prokbert-mini).
+This finetuned model is specifically designed for promoter identification and is based on the [ProkBERT-mini-long model](https://huggingface.co/neuralbioinfo/prokbert-mini-long).
 
 For more details, refer to the [phage dataset description](https://huggingface.co/datasets/neuralbioinfo/phage-test-10k) used for training and evaluating this model.
 
@@ -37,9 +37,9 @@ The following example demonstrates how to use the ProkBERT-mini-promoter model f
 ```python
 from prokbert.prokbert_tokenizer import ProkBERTTokenizer
 from transformers import MegatronBertForSequenceClassification
-finetuned_model = "neuralbioinfo/prokbert-mini-phage"
+finetuned_model = "neuralbioinfo/prokbert-mini-long-phage"
 kmer = 6
-shift = 1
+shift = 2
 
 tok_params = {'kmer' : kmer,
               'shift' : shift}
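The snippet above is cut off by the hunk boundary; the `print(outputs)` context line in the next hunk shows where it ends. For orientation, a minimal end-to-end sketch of the same flow, assuming `ProkBERTTokenizer` accepts the k-mer settings via a `tokenization_params` argument (as in the base ProkBERT-mini card) and supports the standard Hugging Face tokenizer call convention; the DNA segment and the softmax step are illustrative, not from the card:

```python
import torch
from prokbert.prokbert_tokenizer import ProkBERTTokenizer
from transformers import MegatronBertForSequenceClassification

finetuned_model = "neuralbioinfo/prokbert-mini-long-phage"
tok_params = {'kmer': 6, 'shift': 2}  # k6s2, matching the updated card

# Assumption: constructor keyword follows the base ProkBERT-mini example.
tokenizer = ProkBERTTokenizer(tokenization_params=tok_params)
model = MegatronBertForSequenceClassification.from_pretrained(finetuned_model)

segment = "ATGAAATTTGGCCAGTCCGGAATCCGTACGTAGCATGCA"  # hypothetical DNA segment
inputs = tokenizer(segment, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Sequence-level class probabilities from the classification head's logits.
probs = torch.softmax(outputs.logits, dim=-1)
print(outputs)
print(probs)
```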
@@ -61,18 +61,19 @@ print(outputs)
 **Architecture:**
 
 ...
-**Tokenizer:** The model uses a 6-mer tokenizer with a shift of 1 (k6s1), specifically designed to handle DNA sequences efficiently.
+**Tokenizer:** The model uses a 6-mer tokenizer with a shift of 2 (k6s2), specifically designed to handle DNA sequences efficiently.
 
 **Parameters:**
 
 | Parameter            | Description                          |
 |----------------------|--------------------------------------|
-| Model Size |
-| Max. Context Size |
+| Model Size           | 26.6 million parameters              |
+| Max. Context Size    | 4096 bp                              |
 | Training Data        | 206.65 billion nucleotides           |
 | Layers               | 6                                    |
 | Attention Heads      | 6                                    |
 
+
 ### Intended Use
 
 **Intended Use Cases:** ProkBERT-mini-phage is intended for bioinformatics researchers and practitioners focusing on genomic sequence analysis, including:
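The k6s2 label in the updated Tokenizer line is shorthand for overlapping 6-mers sampled every 2 bases. A dependency-free sketch of that windowing (the helper name here is ours, not the library's):

```python
def kmer_tokens(seq: str, kmer: int = 6, shift: int = 2) -> list[str]:
    """Overlapping k-mers taken every `shift` bases (k6s2 by default)."""
    return [seq[i:i + kmer] for i in range(0, len(seq) - kmer + 1, shift)]

print(kmer_tokens("ATGAAATTTGGC"))
# ['ATGAAA', 'GAAATT', 'AATTTG', 'TTTGGC']
```

Because each token advances two bases rather than one, a fixed token budget spans roughly twice as many nucleotides, which lines up with the 4096 bp maximum context in the table.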