ligeti committed
Commit b045f46
1 Parent(s): a25312e

Update README.md

Files changed (1):
  1. README.md (+8, −7)
README.md CHANGED
@@ -11,9 +11,9 @@ tags:
 - promoter-prediction
 - phage
 ---
-## ProkBERT-mini-pahge Model
+## ProkBERT-mini-long-phage Model
 
-This finetuned model is specifically designed for promoter identification and is based on the [ProkBERT-mini model](https://huggingface.co/neuralbioinfo/prokbert-mini).
+This finetuned model is specifically designed for promoter identification and is based on the [ProkBERT-mini-long model](https://huggingface.co/neuralbioinfo/prokbert-mini-long).
 
 For more details, refer to the [phage dataset description](https://huggingface.co/datasets/neuralbioinfo/phage-test-10k) used for training and evaluating this model.
 
@@ -37,9 +37,9 @@ The following example demonstrates how to use the ProkBERT-mini-promoter model f
 ```python
 from prokbert.prokbert_tokenizer import ProkBERTTokenizer
 from transformers import MegatronBertForSequenceClassification
-finetuned_model = "neuralbioinfo/prokbert-mini-phage"
+finetuned_model = "neuralbioinfo/prokbert-mini-long-phage"
 kmer = 6
-shift= 1
+shift= 2
 
 tok_params = {'kmer' : kmer,
               'shift' : shift}
@@ -61,18 +61,19 @@ print(outputs)
 **Architecture:**
 
 ...
-**Tokenizer:** The model uses a 6-mer tokenizer with a shift of 1 (k6s1), specifically designed to handle DNA sequences efficiently.
+**Tokenizer:** The model uses a 6-mer tokenizer with a shift of 2 (k6s2), specifically designed to handle DNA sequences efficiently.
 
 **Parameters:**
 
 | Parameter | Description |
 |----------------------|--------------------------------------|
-| Model Size | 20.6 million parameters |
-| Max. Context Size | 1024 bp |
+| Model Size | 26.6 million parameters |
+| Max. Context Size | 4096 bp |
 | Training Data | 206.65 billion nucleotides |
 | Layers | 6 |
 | Attention Heads | 6 |
 
+
 ### Intended Use
 
 **Intended Use Cases:** ProkBERT-mini-phage is intended for bioinformatics researchers and practitioners focusing on genomic sequence analysis, including:
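The README's usage example is truncated in this view at the `tok_params` dictionary. For orientation, here is a minimal sketch of how the remaining inference steps might look; the `tokenization_params` keyword, the tokenizer's call signature, and the example segment are assumptions made for illustration, not quoted from the model card.

```python
# Sketch only: continues the snippet above under stated assumptions.
import torch
from prokbert.prokbert_tokenizer import ProkBERTTokenizer
from transformers import MegatronBertForSequenceClassification

finetuned_model = "neuralbioinfo/prokbert-mini-long-phage"
tok_params = {'kmer': 6, 'shift': 2}

# Assumption: the params dict is passed via a `tokenization_params` keyword.
tokenizer = ProkBERTTokenizer(tokenization_params=tok_params)
model = MegatronBertForSequenceClassification.from_pretrained(finetuned_model)

segment = "ATGAAAGCATTAGTTCTGGTC"  # placeholder DNA segment, not from the card

# Assumption: the tokenizer supports the standard Hugging Face __call__ API.
inputs = tokenizer(segment, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.logits)  # raw sequence-level class scores
```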
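The k6s2 tokenizer and the 4096 bp context size are linked: advancing a 6-mer window by 2 bases yields roughly half as many tokens per base as the base model's k6s1 scheme. The sketch below illustrates overlapping k-mer tokenization in general; it is not the ProkBERT implementation.

```python
# Overlapping k-mer tokenization with a configurable shift (illustrative only).
def kmer_tokenize(seq: str, kmer: int = 6, shift: int = 2) -> list[str]:
    """Slide a window of length `kmer` over `seq`, advancing by `shift` bases."""
    return [seq[i:i + kmer] for i in range(0, len(seq) - kmer + 1, shift)]

seq = "ATGCGTACGTTAGC"                      # 14 bp toy sequence
print(kmer_tokenize(seq, kmer=6, shift=1))  # k6s1 (mini): 9 tokens
print(kmer_tokenize(seq, kmer=6, shift=2))  # k6s2 (mini-long): 5 tokens

# Token count is floor((L - k) / shift) + 1, so a full 4096 bp window with
# k=6 and shift=2 corresponds to floor(4090 / 2) + 1 = 2046 k-mer tokens.
```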