Gurveer05 committed on
Commit
46bc94b
·
2 Parent(s): d9d8dae ea2d1c7

Merge branch 'main' of https://huggingface.co/Gurveer05/FloraBERT

Files changed (1)
  1. README.md +17 -1
README.md CHANGED
@@ -11,4 +11,20 @@ Currently, this model is trained on **7.1 million Plant DNA promoter sequences**
 
  References:
  - [GitHub Repository](https://github.com/gurveervirk/florabert/)
- - [Kaggle Dataset](https://www.kaggle.com/datasets/gurveersinghvirk/florabert-base)
+ - [Kaggle Dataset](https://www.kaggle.com/datasets/gurveersinghvirk/florabert-base)
+
+ To get predictions for **plant DNA promoter sequences**, add a text file containing the sequences (one sequence per line) to the data folder, then call the main() function in prediction.py with your file name.
+ For example:
+
+ - In prediction.py, replace the file name in ```main("test.txt")``` with your own
+ - Run ```python prediction.py```
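
Preparing the input file described above could look like the following minimal sketch. The helper `write_sequence_file`, the `ACGTN` alphabet check, and the `data/my_sequences.txt` path are assumptions made for illustration; they are not part of the FloraBERT repository.

```python
from pathlib import Path

# Assumption: sequences use the DNA alphabet A, C, G, T, plus N for unknown bases.
VALID_BASES = set("ACGTN")


def write_sequence_file(sequences, path):
    """Write cleaned DNA sequences to `path`, one per line; return how many were written."""
    cleaned = []
    for index, raw in enumerate(sequences):
        seq = raw.strip().upper()
        if not seq:
            continue  # skip blank entries
        invalid = set(seq) - VALID_BASES
        if invalid:
            raise ValueError(
                f"sequence {index} contains invalid characters: {sorted(invalid)}"
            )
        cleaned.append(seq)
    path = Path(path)
    # Create the target folder if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("".join(seq + "\n" for seq in cleaned))
    return len(cleaned)
```

For instance, `write_sequence_file(my_sequences, "data/my_sequences.txt")` would produce a file that can then be passed by name to `main()`, assuming the data folder is named `data`.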
+
+ The results are printed to the console as a table.
+ For example:
+ | tassel | base | anther | middle | ear | shoot | tip | root |
+ |--------|--------|--------|--------|--------|--------|--------|--------|
+ | 0.4235 | 0.3031 | 0.3657 | 0.3663 | 0.2787 | 0.3809 | 0.4167 | 0.2861 |
+
+ The values in the table are predicted TPM (transcripts per million) values for each plant tissue. TPM is a normalized measure of gene expression.
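
For readers unfamiliar with TPM, the sketch below shows the standard way TPM is computed from measured read counts: each count is divided by its gene length, and the resulting rates are rescaled to sum to one million. The function name `tpm` and its inputs are hypothetical, and the sketch does not describe how the model computes its predictions.

```python
def tpm(read_counts, gene_lengths_kb):
    """Convert raw read counts to TPM, given gene lengths in kilobases."""
    if len(read_counts) != len(gene_lengths_kb):
        raise ValueError("read_counts and gene_lengths_kb must have the same length")
    # Step 1: normalize each count by gene length (reads per kilobase).
    rates = [count / length for count, length in zip(read_counts, gene_lengths_kb)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    # Step 2: rescale so that all values in the sample sum to one million.
    return [rate / total * 1_000_000 for rate in rates]
```

Because of the rescaling in the second step, TPM values within a sample always sum to one million, which is what makes them comparable across samples.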
+
+ Both models can also be pretrained further or fine-tuned. See the references above for details.