---
language: "en"
widget:
- text: "On the other hand, a decline of the arsenic content in hair and nail was observed after withdrawal of the drug."
- text: "These differences in gene expression have not been molecularly defined."
- text: "p65 was detected in the cytoplasm of FDC , whereas nuclei were negative."
datasets:
- Genia
---
## A Biomedical POS Tagger for English
A part-of-speech tagger for English biomedical text, trained on the GENIA corpus.
Eval:
```
              precision    recall  f1-score   support

           0      0.98      1.00      0.99       263
           3      0.93      1.00      0.97        14
           5      1.00      1.00      1.00         8
           6      0.99      0.99      0.99       169
           7      1.00      1.00      1.00       203
           8      0.99      1.00      1.00       195
           9      0.95      0.78      0.85        98
          10      0.83      1.00      0.91         5
          11      0.96      0.97      0.96       532
          12      1.00      1.00      1.00       252
          13      0.99      0.98      0.99      1575
          14      0.95      0.95      0.95       133
          15      0.89      0.89      0.89         9
          16      1.00      1.00      1.00         3
          18      0.99      1.00      0.99        69
          19      1.00      0.95      0.98        22
          20      0.99      1.00      1.00       395
          22      1.00      1.00      1.00      1328
          23      1.00      1.00      1.00       987
          24      1.00      1.00      1.00         6
          25      0.00      0.00      0.00         0
          26      1.00      1.00      1.00       620
          27      0.00      0.00      0.00         1
          28      1.00      1.00      1.00        39
          29      0.98      0.99      0.98      5674
          30      0.97      0.96      0.96      2075
          31      1.00      0.71      0.83         7
          32      1.00      0.80      0.89         5
          33      1.00      1.00      1.00        58
          34      1.00      1.00      1.00         2
          35      0.96      0.96      0.96       336
          37      0.99      1.00      1.00      1579
          38      1.00      1.00      1.00      1446
          39      1.00      0.98      0.99        57

    accuracy                          0.99     18165
   macro avg      0.92      0.91      0.91     18165
weighted avg      0.99      0.99      0.99     18165

F1: 0.985267446136761
Accuracy: 0.9853564547206166
```
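The macro average (0.91) sits well below the weighted average (0.99) because the macro average gives every label the same weight, so the two labels with an F1 of 0.00 (25 and 27, with supports of 0 and 1) pull it down. The sketch below recomputes both averages from the per-label rows of the report. The function names are illustrative, and because the inputs are the rounded two-decimal values, the results only approximate the reported figures.

```python
# (label id, f1-score, support) copied from the evaluation report.
REPORT_ROWS = [
    (0, 0.99, 263), (3, 0.97, 14), (5, 1.00, 8), (6, 0.99, 169),
    (7, 1.00, 203), (8, 1.00, 195), (9, 0.85, 98), (10, 0.91, 5),
    (11, 0.96, 532), (12, 1.00, 252), (13, 0.99, 1575), (14, 0.95, 133),
    (15, 0.89, 9), (16, 1.00, 3), (18, 0.99, 69), (19, 0.98, 22),
    (20, 1.00, 395), (22, 1.00, 1328), (23, 1.00, 987), (24, 1.00, 6),
    (25, 0.00, 0), (26, 1.00, 620), (27, 0.00, 1), (28, 1.00, 39),
    (29, 0.98, 5674), (30, 0.96, 2075), (31, 0.83, 7), (32, 0.89, 5),
    (33, 1.00, 58), (34, 1.00, 2), (35, 0.96, 336), (37, 1.00, 1579),
    (38, 1.00, 1446), (39, 0.99, 57),
]


def macro_f1(rows):
    """Unweighted mean of per-label F1: every label counts equally."""
    return sum(f1 for _, f1, _ in rows) / len(rows)


def weighted_f1(rows):
    """Mean of per-label F1 weighted by support (number of gold tokens)."""
    total_support = sum(support for _, _, support in rows)
    return sum(f1 * support for _, f1, support in rows) / total_support


def weakest(rows, k=3):
    """Return the k rows with the lowest F1 (ties keep report order)."""
    return sorted(rows, key=lambda row: row[1])[:k]


if __name__ == "__main__":
    print(f"macro F1    ~ {macro_f1(REPORT_ROWS):.3f}")
    print(f"weighted F1 ~ {weighted_f1(REPORT_ROWS):.3f}")
    for label, f1, support in weakest(REPORT_ROWS):
        print(f"label {label}: f1={f1:.2f}, support={support}")
```

Labels with very small support (for example 25, 27, 31 and 32) say little about real tagging quality, so the weighted average and the overall accuracy are the more stable summary figures here.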
Tags:
```
{0: 'VBD',
1: 'N',
2: 'XT',
3: 'JJS',
4: 'E2A',
5: 'WRB',
6: 'VB',
7: 'TO',
8: 'VBP',
9: 'FW',
10: 'EX',
11: 'VBN',
12: 'VBZ',
13: 'NNS',
14: 'VBG',
15: 'RBR',
16: 'WP',
17: 'CT',
18: 'PRP',
19: 'JJR',
20: 'CC',
21: 'NNPS',
22: 'CD',
23: 'DT',
24: 'NNP',
25: 'PDT',
26: 'LS',
27: 'PP',
28: 'PRP$',
29: 'NN',
30: 'JJ',
31: 'RP',
32: 'RBS',
33: 'MD',
34: 'WP$',
35: 'RB',
36: 'SYM',
37: 'IN',
38: 'PUNCT',
39: 'WDT',
40: 'POS',
41: '<pad>'}
```
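The numeric labels in the evaluation report map to tag names through this table. As a minimal sketch of how predicted ids could be turned back into tags, the snippet below uses an excerpt of the table and drops the `<pad>` label. The function name and the example ids are illustrative assumptions, not actual model output.

```python
# Excerpt of the id-to-tag table; a real script would use the full mapping.
ID2TAG = {
    12: "VBZ",
    23: "DT",
    29: "NN",
    30: "JJ",
    37: "IN",
    38: "PUNCT",
    41: "<pad>",
}


def decode_ids(pred_ids, id2tag=ID2TAG, pad_tag="<pad>"):
    """Map predicted label ids to tag names, skipping padding positions."""
    tags = []
    for idx in pred_ids:
        tag = id2tag[idx]  # raises KeyError for ids missing from the table
        if tag != pad_tag:
            tags.append(tag)
    return tags


if __name__ == "__main__":
    # Hypothetical ids for a five-token sentence padded to length seven.
    tokens = ["The", "gene", "is", "active", "."]
    pred_ids = [23, 29, 12, 30, 38, 41, 41]
    for token, tag in zip(tokens, decode_ids(pred_ids)):
        print(f"{token}\t{tag}")
```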
Parameters:
```
nepochs = 30 (stopped early at epoch 18)
batch_size = 32
batch_status = 32
learning_rate = 1e-5
early_stop = 3
max_length = 200
checkpoint: dmis-lab/biobert-base-cased-v1.2
```
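The `nepochs` and `early_stop` values suggest patience-based early stopping. The sketch below shows one common form of it, assuming `early_stop` counts consecutive epochs without improvement on a development score. That reading is an assumption, the function name and scores are made up for illustration, and the actual training script in the linked repository may differ.

```python
def run_with_early_stopping(dev_scores, nepochs=30, early_stop=3):
    """Replay per-epoch dev scores and report where training would stop.

    Returns (last_epoch_run, best_epoch), both 1-based.
    Assumes early_stop is a patience value measured in epochs.
    """
    best_score = float("-inf")
    best_epoch = 0
    epochs_without_gain = 0
    last_epoch = 0

    for epoch, score in enumerate(dev_scores[:nepochs], start=1):
        last_epoch = epoch
        if score > best_score:
            # New best: remember it and reset the patience counter.
            best_score, best_epoch = score, epoch
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= early_stop:
                break  # patience exhausted

    return last_epoch, best_epoch


if __name__ == "__main__":
    # Hypothetical dev scores: the best value is reached at epoch 3.
    scores = [0.90, 0.95, 0.97, 0.96, 0.965, 0.96, 0.99]
    print(run_with_early_stopping(scores))
```

With a patience of 3, the run in the example ends three epochs after the last improvement, and the checkpoint from the best epoch is the one worth keeping.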
See more at: https://github.com/lisaterumi/postagger-bio-english