First model version
- README.md +96 -3
- loss.tsv +95 -0
- pytorch_model.bin +3 -0
- training.log +0 -0
README.md
CHANGED
@@ -1,3 +1,96 @@
---
tags:
- flair
- hunflair
- token-classification
- sequence-tagger-model
language: en
widget:
- text: "Two putative extended promoter consensus sequences (p1 and p2)."
---

## HunFlair model for PROMOTER

[HunFlair](https://github.com/flairNLP/flair/blob/master/resources/docs/HUNFLAIR.md) (biomedical flair) model for promoter entities.

Predicts 1 tag:

| **tag**  | **meaning**         |
|----------|---------------------|
| Promoter | DNA promoter region |

---

### Demo: How to use in Flair

Requires:

- **[Flair](https://github.com/flairNLP/flair/)** (`pip install flair`)

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# for biomedical-specific tokenization:
# from flair.tokenization import SciSpacyTokenizer

# load tagger
tagger = SequenceTagger.load("regel-corpus/hunflair-promoter")

text = "The upstream region of the glnA gene contained two putative extended promoter consensus sequences (p1 and p2)."

# make example sentence
sentence = Sentence(text)

# for biomedical-specific tokenization:
# sentence = Sentence(text, use_tokenizer=SciSpacyTokenizer())

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')

# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
```

This yields the following output:

```
Span [16]: "p1" [− Labels: Promoter (0.9878)]
Span [18]: "p2" [− Labels: Promoter (0.9216)]
```

So the entities "*p1*" and "*p2*" (both labeled **Promoter**) are found in the sentence.
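
Beyond printing, each predicted span carries its surface text and one or more labels with confidence scores, which is handy when post-processing predictions. A minimal sketch, continuing the snippet above (attribute names follow the Flair `Span`/`Label` API and may differ slightly across Flair versions):

```python
# continuing the snippet above: inspect each span's text, tag, and confidence
for entity in sentence.get_spans('ner'):
    for label in entity.labels:
        print(entity.text, label.value, round(label.score, 4))

# expected output for the example sentence (cf. the spans shown above):
#   p1 Promoter 0.9878
#   p2 Promoter 0.9216
```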

Alternatively, download all models locally and use the `MultiTagger` class.

```python
from flair.models import MultiTagger

# local paths to the downloaded models
models = [
    './models/hunflair-promoter/pytorch_model.bin',
    './models/hunflair-enhancer/pytorch_model.bin',
    './models/hunflair-tfbs/pytorch_model.bin',
]

tagger = MultiTagger.load(models)

tagger.predict(sentence)
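
After `tagger.predict(sentence)`, annotations from all three models sit on the same `Sentence` object. A hedged sketch for reading them back per model (this assumes Flair's `MultiTagger` keys predictions by the names passed to `load`, here the file paths, which may differ across Flair versions):

```python
# continuing the snippet above — an assumption worth verifying against
# your installed Flair version: predictions are keyed by load name
for name in models:
    print(f'Spans from {name}:')
    for entity in sentence.get_spans(name):
        print(entity)
```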

---

### Cite

Please cite the following paper when using this model.

```
TODO
```
loss.tsv
ADDED
@@ -0,0 +1,95 @@
EPOCH TIMESTAMP BAD_EPOCHS LEARNING_RATE TRAIN_LOSS
1 10:23:55 0 0.1000 0.14414316049924025
2 10:24:20 0 0.1000 0.029232208457494142
3 10:24:44 0 0.1000 0.020709762981633278
4 10:25:08 0 0.1000 0.016166533528691844
5 10:25:35 0 0.1000 0.013877521233015666
6 10:26:00 0 0.1000 0.012535358185859491
7 10:26:24 0 0.1000 0.009929127666810905
8 10:26:53 0 0.1000 0.009904013860608667
9 10:27:21 0 0.1000 0.007673729182919577
10 10:27:51 0 0.1000 0.007568819448139311
11 10:28:19 0 0.1000 0.006279606966907699
12 10:28:47 0 0.1000 0.005612897127472563
13 10:29:14 0 0.1000 0.005172219379267985
14 10:29:42 0 0.1000 0.004260889965477517
15 10:30:11 1 0.1000 0.005011991191810867
16 10:30:39 0 0.1000 0.004138073429066629
17 10:31:10 1 0.1000 0.0047193605160269735
18 10:31:38 0 0.1000 0.0038242106113829367
19 10:32:04 0 0.1000 0.0031533638335353143
20 10:32:32 1 0.1000 0.0033676612401528742
21 10:33:02 0 0.1000 0.002569373648697204
22 10:33:33 1 0.1000 0.004006711259657886
23 10:34:04 2 0.1000 0.0038302141879308963
24 10:34:34 3 0.1000 0.0034827371825209463
25 10:35:04 0 0.1000 0.0022203847949101236
26 10:35:34 1 0.1000 0.0035446000232492933
27 10:36:03 2 0.1000 0.0027407845299534566
28 10:36:34 0 0.1000 0.0017049528980459916
29 10:37:03 1 0.1000 0.0029220726286789795
30 10:37:33 2 0.1000 0.0023985954273666736
31 10:38:04 3 0.1000 0.002136745013820866
32 10:38:34 4 0.1000 0.0019484500696445469
33 10:39:05 0 0.0500 0.0015295510425759754
34 10:39:36 0 0.0500 0.0013157812400466545
35 10:40:06 1 0.0500 0.0015454900085796826
36 10:40:37 2 0.0500 0.0015408478840587805
37 10:41:07 0 0.0500 0.001040015612462077
38 10:41:36 0 0.0500 0.0010245211081360034
39 10:42:06 1 0.0500 0.0014156307836725886
40 10:42:35 2 0.0500 0.0012796116301640845
41 10:43:05 0 0.0500 0.0010081329515074624
42 10:43:35 0 0.0500 0.0008922960641543727
43 10:44:06 1 0.0500 0.0011115807490792674
44 10:44:36 0 0.0500 0.0007814267874369996
45 10:45:06 1 0.0500 0.0010762021693876463
46 10:45:36 2 0.0500 0.0008758938196161811
47 10:46:05 3 0.0500 0.0010533888772798884
48 10:46:34 0 0.0500 0.0007460460114560931
49 10:47:04 1 0.0500 0.0009105747707706518
50 10:47:34 2 0.0500 0.001265567517687931
51 10:48:05 3 0.0500 0.0007957896883349667
52 10:48:33 4 0.0500 0.0009136888517542425
53 10:49:02 1 0.0250 0.0009179902546400623
54 10:49:32 2 0.0250 0.0007747437363646763
55 10:50:02 0 0.0250 0.0004947569001327801
56 10:50:33 1 0.0250 0.0006043193451525759
57 10:51:03 2 0.0250 0.0005192691506143094
58 10:51:32 3 0.0250 0.0009651290532809225
59 10:52:02 4 0.0250 0.0005995979365671106
60 10:52:32 1 0.0125 0.0006073628282967953
61 10:53:02 2 0.0125 0.000751265562195149
62 10:53:32 0 0.0125 0.0003795679379052658
63 10:54:02 1 0.0125 0.0005967257244143705
64 10:54:32 2 0.0125 0.00040644731336306905
65 10:55:01 3 0.0125 0.00038474730248631633
66 10:55:30 4 0.0125 0.0004631287515121994
67 10:56:00 1 0.0063 0.0006316340607622421
68 10:56:31 2 0.0063 0.0006514909298415971
69 10:57:00 3 0.0063 0.0006214523010719282
70 10:57:29 4 0.0063 0.00041863117870770584
71 10:58:00 1 0.0031 0.0005170064977770152
72 10:58:30 0 0.0031 0.00034606219938236983
73 10:59:01 1 0.0031 0.0006326316905439407
74 10:59:30 2 0.0031 0.0007171690416408774
75 10:59:59 3 0.0031 0.000689324585012857
76 11:00:30 4 0.0031 0.00037845982092203986
77 11:00:59 1 0.0016 0.0005252659305373181
78 11:01:29 2 0.0016 0.0006309600967474817
79 11:01:59 3 0.0016 0.0006728906822507321
80 11:02:28 4 0.0016 0.00040474677207810093
81 11:02:56 1 0.0008 0.0008017414542208203
82 11:03:24 0 0.0008 0.0002959322525987299
83 11:03:53 1 0.0008 0.0006421615282579078
84 11:04:21 2 0.0008 0.00044524264644547337
85 11:04:50 3 0.0008 0.00036031093281027734
86 11:05:20 4 0.0008 0.000411238132604508
87 11:05:49 1 0.0004 0.0003842137822674801
88 11:06:19 2 0.0004 0.0005520959587385654
89 11:06:48 3 0.0004 0.0003467164363663418
90 11:07:16 4 0.0004 0.00045523180541593295
91 11:07:44 1 0.0002 0.00044073285330061246
92 11:08:13 2 0.0002 0.00037618968882934147
93 11:08:41 3 0.0002 0.0004827663478180362
94 11:09:09 4 0.0002 0.00035126271561428585
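
The log records, per epoch, the timestamp, the number of consecutive epochs without improvement (`BAD_EPOCHS`), the current learning rate, and the training loss; the learning rate is halved after several epochs without improvement, visible in the steps from 0.1000 down to 0.0002. A minimal sketch for inspecting the file (assuming it is tab-separated, as the extension suggests):

```python
import pandas as pd

# load the per-epoch training log written during training
df = pd.read_csv('loss.tsv', sep='\t')

# final training loss and the epochs at which the learning rate was annealed
print(df['TRAIN_LOSS'].iloc[-1])
print(df.loc[df['LEARNING_RATE'].diff() < 0, 'EPOCH'].tolist())
```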
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d0e6bb5ec6755cdbf59a452c936a164e042d111f2a7d857c4feff2fa5d5a5300
size 1104819835
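
`pytorch_model.bin` is stored via Git LFS, so the three lines above are only a pointer; the actual ~1.1 GB of weights live in LFS storage. One way to fetch the file itself, as a sketch using `huggingface_hub` and assuming the repo id from the README:

```python
from huggingface_hub import hf_hub_download

# resolves the LFS pointer and downloads the ~1.1 GB weights to the local cache
path = hf_hub_download(repo_id="regel-corpus/hunflair-promoter",
                       filename="pytorch_model.bin")
print(path)
```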
training.log
ADDED
The diff for this file is too large to render.