wbi-sg committed
Commit c70bdf4
1 Parent(s): 7a8f1e5

First model version

Files changed (4)
  1. README.md +96 -3
  2. loss.tsv +95 -0
  3. pytorch_model.bin +3 -0
  4. training.log +0 -0
README.md CHANGED
@@ -1,3 +1,96 @@
- ---
- license: mit
- ---
+ ---
+ tags:
+ - flair
+ - hunflair
+ - token-classification
+ - sequence-tagger-model
+ language: en
+ widget:
+ - text: "Two putative extended promoter consensus sequences (p1 and p2)."
+ ---
+
+ ## HunFlair model for PROMOTER
+
+ [HunFlair](https://github.com/flairNLP/flair/blob/master/resources/docs/HUNFLAIR.md) (biomedical flair) model for promoter entities.
+
+ Predicts 1 tag:
+
+ | **tag**  | **meaning**         |
+ |----------|---------------------|
+ | Promoter | DNA promoter region |
+
+ ---
+
+ ### Demo: How to use in Flair
+
+ Requires:
+ - **[Flair](https://github.com/flairNLP/flair/)** (`pip install flair`)
+
+ ```python
+ from flair.data import Sentence
+ from flair.models import SequenceTagger
+
+ # for biomedical-specific tokenization:
+ # from flair.tokenization import SciSpacyTokenizer
+
+ # load tagger
+ tagger = SequenceTagger.load("regel-corpus/hunflair-promoter")
+
+ text = "The upstream region of the glnA gene contained two putative extended promoter consensus sequences (p1 and p2)."
+
+ # make example sentence
+ sentence = Sentence(text)
+
+ # for biomedical-specific tokenization:
+ # sentence = Sentence(text, use_tokenizer=SciSpacyTokenizer())
+
+ # predict NER tags
+ tagger.predict(sentence)
+
+ # print sentence
+ print(sentence)
+
+ # print predicted NER spans
+ print('The following NER tags are found:')
+ # iterate over entities and print
+ for entity in sentence.get_spans('ner'):
+     print(entity)
+ ```
+
+ This yields the following output:
+
+ ```
+ Span [16]: "p1" [− Labels: Promoter (0.9878)]
+ Span [18]: "p2" [− Labels: Promoter (0.9216)]
+ ```
+
+ So the entities "*p1*" and "*p2*" (both labeled as **Promoter**) are found in the sentence.
+
+ Alternatively, download all models locally and use the `MultiTagger` class.
+
+ ```python
+ from flair.models import MultiTagger
+
+ # paths to the locally downloaded model files
+ model_paths = [
+     './models/hunflair-promoter/pytorch_model.bin',
+     './models/hunflair-enhancer/pytorch_model.bin',
+     './models/hunflair-tfbs/pytorch_model.bin',
+ ]
+
+ # load all taggers at once
+ tagger = MultiTagger.load(model_paths)
+
+ tagger.predict(sentence)
+ ```
+
+ ---
+
+ ### Cite
+
+ Please cite the following paper when using this model.
+
+ ```
+ TODO
+ ```
loss.tsv ADDED
@@ -0,0 +1,95 @@
+ EPOCH	TIMESTAMP	BAD_EPOCHS	LEARNING_RATE	TRAIN_LOSS
+ 1	10:23:55	0	0.1000	0.14414316049924025
+ 2	10:24:20	0	0.1000	0.029232208457494142
+ 3	10:24:44	0	0.1000	0.020709762981633278
+ 4	10:25:08	0	0.1000	0.016166533528691844
+ 5	10:25:35	0	0.1000	0.013877521233015666
+ 6	10:26:00	0	0.1000	0.012535358185859491
+ 7	10:26:24	0	0.1000	0.009929127666810905
+ 8	10:26:53	0	0.1000	0.009904013860608667
+ 9	10:27:21	0	0.1000	0.007673729182919577
+ 10	10:27:51	0	0.1000	0.007568819448139311
+ 11	10:28:19	0	0.1000	0.006279606966907699
+ 12	10:28:47	0	0.1000	0.005612897127472563
+ 13	10:29:14	0	0.1000	0.005172219379267985
+ 14	10:29:42	0	0.1000	0.004260889965477517
+ 15	10:30:11	1	0.1000	0.005011991191810867
+ 16	10:30:39	0	0.1000	0.004138073429066629
+ 17	10:31:10	1	0.1000	0.0047193605160269735
+ 18	10:31:38	0	0.1000	0.0038242106113829367
+ 19	10:32:04	0	0.1000	0.0031533638335353143
+ 20	10:32:32	1	0.1000	0.0033676612401528742
+ 21	10:33:02	0	0.1000	0.002569373648697204
+ 22	10:33:33	1	0.1000	0.004006711259657886
+ 23	10:34:04	2	0.1000	0.0038302141879308963
+ 24	10:34:34	3	0.1000	0.0034827371825209463
+ 25	10:35:04	0	0.1000	0.0022203847949101236
+ 26	10:35:34	1	0.1000	0.0035446000232492933
+ 27	10:36:03	2	0.1000	0.0027407845299534566
+ 28	10:36:34	0	0.1000	0.0017049528980459916
+ 29	10:37:03	1	0.1000	0.0029220726286789795
+ 30	10:37:33	2	0.1000	0.0023985954273666736
+ 31	10:38:04	3	0.1000	0.002136745013820866
+ 32	10:38:34	4	0.1000	0.0019484500696445469
+ 33	10:39:05	0	0.0500	0.0015295510425759754
+ 34	10:39:36	0	0.0500	0.0013157812400466545
+ 35	10:40:06	1	0.0500	0.0015454900085796826
+ 36	10:40:37	2	0.0500	0.0015408478840587805
+ 37	10:41:07	0	0.0500	0.001040015612462077
+ 38	10:41:36	0	0.0500	0.0010245211081360034
+ 39	10:42:06	1	0.0500	0.0014156307836725886
+ 40	10:42:35	2	0.0500	0.0012796116301640845
+ 41	10:43:05	0	0.0500	0.0010081329515074624
+ 42	10:43:35	0	0.0500	0.0008922960641543727
+ 43	10:44:06	1	0.0500	0.0011115807490792674
+ 44	10:44:36	0	0.0500	0.0007814267874369996
+ 45	10:45:06	1	0.0500	0.0010762021693876463
+ 46	10:45:36	2	0.0500	0.0008758938196161811
+ 47	10:46:05	3	0.0500	0.0010533888772798884
+ 48	10:46:34	0	0.0500	0.0007460460114560931
+ 49	10:47:04	1	0.0500	0.0009105747707706518
+ 50	10:47:34	2	0.0500	0.001265567517687931
+ 51	10:48:05	3	0.0500	0.0007957896883349667
+ 52	10:48:33	4	0.0500	0.0009136888517542425
+ 53	10:49:02	1	0.0250	0.0009179902546400623
+ 54	10:49:32	2	0.0250	0.0007747437363646763
+ 55	10:50:02	0	0.0250	0.0004947569001327801
+ 56	10:50:33	1	0.0250	0.0006043193451525759
+ 57	10:51:03	2	0.0250	0.0005192691506143094
+ 58	10:51:32	3	0.0250	0.0009651290532809225
+ 59	10:52:02	4	0.0250	0.0005995979365671106
+ 60	10:52:32	1	0.0125	0.0006073628282967953
+ 61	10:53:02	2	0.0125	0.000751265562195149
+ 62	10:53:32	0	0.0125	0.0003795679379052658
+ 63	10:54:02	1	0.0125	0.0005967257244143705
+ 64	10:54:32	2	0.0125	0.00040644731336306905
+ 65	10:55:01	3	0.0125	0.00038474730248631633
+ 66	10:55:30	4	0.0125	0.0004631287515121994
+ 67	10:56:00	1	0.0063	0.0006316340607622421
+ 68	10:56:31	2	0.0063	0.0006514909298415971
+ 69	10:57:00	3	0.0063	0.0006214523010719282
+ 70	10:57:29	4	0.0063	0.00041863117870770584
+ 71	10:58:00	1	0.0031	0.0005170064977770152
+ 72	10:58:30	0	0.0031	0.00034606219938236983
+ 73	10:59:01	1	0.0031	0.0006326316905439407
+ 74	10:59:30	2	0.0031	0.0007171690416408774
+ 75	10:59:59	3	0.0031	0.000689324585012857
+ 76	11:00:30	4	0.0031	0.00037845982092203986
+ 77	11:00:59	1	0.0016	0.0005252659305373181
+ 78	11:01:29	2	0.0016	0.0006309600967474817
+ 79	11:01:59	3	0.0016	0.0006728906822507321
+ 80	11:02:28	4	0.0016	0.00040474677207810093
+ 81	11:02:56	1	0.0008	0.0008017414542208203
+ 82	11:03:24	0	0.0008	0.0002959322525987299
+ 83	11:03:53	1	0.0008	0.0006421615282579078
+ 84	11:04:21	2	0.0008	0.00044524264644547337
+ 85	11:04:50	3	0.0008	0.00036031093281027734
+ 86	11:05:20	4	0.0008	0.000411238132604508
+ 87	11:05:49	1	0.0004	0.0003842137822674801
+ 88	11:06:19	2	0.0004	0.0005520959587385654
+ 89	11:06:48	3	0.0004	0.0003467164363663418
+ 90	11:07:16	4	0.0004	0.00045523180541593295
+ 91	11:07:44	1	0.0002	0.00044073285330061246
+ 92	11:08:13	2	0.0002	0.00037618968882934147
+ 93	11:08:41	3	0.0002	0.0004827663478180362
+ 94	11:09:09	4	0.0002	0.00035126271561428585
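
The loss log above is a plain tab-separated file, so it can be inspected with the standard library alone. A minimal sketch (the inlined rows are copied from the table above; in practice you would `open("loss.tsv")` instead of the inlined string):

```python
import csv
import io

# A few rows copied from loss.tsv above (tab-separated, same header).
sample = (
    "EPOCH\tTIMESTAMP\tBAD_EPOCHS\tLEARNING_RATE\tTRAIN_LOSS\n"
    "1\t10:23:55\t0\t0.1000\t0.14414316049924025\n"
    "33\t10:39:05\t0\t0.0500\t0.0015295510425759754\n"
    "82\t11:03:24\t0\t0.0008\t0.0002959322525987299\n"
)

# Parse the rows and find the epoch with the lowest training loss.
rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))
best = min(rows, key=lambda r: float(r["TRAIN_LOSS"]))
print(f"epochs parsed: {len(rows)}, best: epoch {best['EPOCH']} "
      f"(loss {float(best['TRAIN_LOSS']):.6f}, lr {best['LEARNING_RATE']})")
```

The same pattern works on the full file: the `BAD_EPOCHS` column tracks the annealing patience, which is why the learning rate halves (0.1 → 0.05 → …) whenever it reaches its limit.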
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d0e6bb5ec6755cdbf59a452c936a164e042d111f2a7d857c4feff2fa5d5a5300
+ size 1104819835
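
`pytorch_model.bin` is stored as a Git LFS pointer, so only the `oid`/`size` metadata lives in the repo; after fetching the real file (e.g. `git lfs pull`), it can be checked against the recorded digest. A minimal sketch with Python's `hashlib`, streaming in chunks since the file is ~1.1 GB (the relative path is an assumption):

```python
import hashlib

# sha256 recorded in the LFS pointer above
EXPECTED_OID = "d0e6bb5ec6755cdbf59a452c936a164e042d111f2a7d857c4feff2fa5d5a5300"

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks to keep memory use constant."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()

# After `git lfs pull`, this should hold (path assumed relative to the repo root):
# assert sha256_of("pytorch_model.bin") == EXPECTED_OID
```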
training.log ADDED
The diff for this file is too large to render.