---
license: mit
tags:
- generated_from_trainer
datasets:
- jnlpba
widget:
- text: "The widespread circular form of DNA molecules inside cells creates very serious topological problems during replication. Due to the helical structure of the double helix the parental strands of circular DNA form a link of very high order, and yet they have to be unlinked before the cell division."
- text: "It consists of 25 exons encoding a 1,278-amino acid glycoprotein that is composed of 13 transmembrane domains"
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: pubmedbert-finetuned-ner
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: jnlpba
      type: jnlpba
      config: jnlpba
      split: train
      args: jnlpba
    metrics:
    - name: Precision
      type: precision
      value: 0.6877153861747415
    - name: Recall
      type: recall
      value: 0.7833063957515586
    - name: F1
      type: f1
      value: 0.7324050086355786
    - name: Accuracy
      type: accuracy
      value: 0.926729986431479
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# pubmedbert-finetuned-ner

This model is a fine-tuned version of [microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext) on the JNLPBA dataset (`jnlpba`).
After the final (fifth) training epoch, it achieves the following results on the evaluation set:
- Loss: 0.3766
- Precision: 0.6877
- Recall: 0.7833
- F1: 0.7324
- Accuracy: 0.9267
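
As a minimal usage sketch, the model can be loaded through the Transformers `pipeline` API for token classification. The repository id below (`your-namespace/pubmedbert-finetuned-ner`) is a placeholder, not the real Hub path, and the helper names `format_entities` and `tag` are hypothetical. The exact label strings depend on the model's config; JNLPBA annotates DNA, RNA, protein, cell type, and cell line mentions.

```python
def format_entities(entities):
    """Reduce aggregated pipeline output to (text, label, score) tuples."""
    return [
        (e["word"], e["entity_group"], round(float(e["score"]), 3))
        for e in entities
    ]


def tag(text, model_id="your-namespace/pubmedbert-finetuned-ner"):
    """Run NER on `text`. `model_id` is a placeholder; substitute the real repo id."""
    # Imported lazily so the formatting helper works without transformers installed.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole entity spans
    )
    return format_entities(ner(text))


if __name__ == "__main__":
    sentence = (
        "It consists of 25 exons encoding a 1,278-amino acid glycoprotein "
        "that is composed of 13 transmembrane domains"
    )
    for word, label, score in tag(sentence):
        print(f"{label:<12} {score:.3f}  {word}")
```

With `aggregation_strategy="simple"`, each returned entity is a dictionary with `entity_group`, `score`, `word`, `start`, and `end` keys, which is what `format_entities` relies on.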

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
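
The hyperparameters above can be sketched as keyword arguments for `transformers.TrainingArguments`, together with the learning rate that a linear schedule would produce at a given step. This is an illustration under assumptions, not the original training script: it assumes a single device (so the listed batch sizes equal the per-device ones), zero warmup steps (the card lists none), and a total of 11,595 optimizer steps as reported in the training results table. The names `TRAINING_KWARGS` and `linear_lr` are hypothetical.

```python
# Assumed mapping of the listed hyperparameters onto TrainingArguments fields.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes one device
    "per_device_eval_batch_size": 16,   # assumes one device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
}

TOTAL_STEPS = 11595  # final step in the training results table


def linear_lr(step, base_lr=2e-5, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Learning rate at the end of each epoch (2,319 steps per epoch).
    for epoch in range(6):
        step = epoch * 2319
        print(f"step {step:>5}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate starts at 2e-05 and reaches zero at the last step, dropping by one fifth of the base rate per epoch.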

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.1607        | 1.0   | 2319  | 0.2241          | 0.6853    | 0.7835 | 0.7311 | 0.9302   |
| 0.112         | 2.0   | 4638  | 0.2620          | 0.6753    | 0.7929 | 0.7294 | 0.9276   |
| 0.0785        | 3.0   | 6957  | 0.3014          | 0.6948    | 0.7731 | 0.7319 | 0.9268   |
| 0.055         | 4.0   | 9276  | 0.3526          | 0.6898    | 0.7801 | 0.7322 | 0.9268   |
| 0.0418        | 5.0   | 11595 | 0.3766          | 0.6877    | 0.7833 | 0.7324 | 0.9267   |
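
As a quick consistency check on the table above, the reported F1 values can be recomputed from precision and recall, assuming F1 here is the harmonic mean of the overall precision and recall in each row. The names `ROWS`, `f1_score`, and `max_f1_gap` are hypothetical; the numbers are copied from the table, so small differences from rounding to four decimals are expected.

```python
# (epoch, precision, recall, reported F1), copied from the training results table.
ROWS = [
    (1, 0.6853, 0.7835, 0.7311),
    (2, 0.6753, 0.7929, 0.7294),
    (3, 0.6948, 0.7731, 0.7319),
    (4, 0.6898, 0.7801, 0.7322),
    (5, 0.6877, 0.7833, 0.7324),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def max_f1_gap(rows):
    """Largest absolute gap between recomputed and reported F1."""
    return max(abs(f1_score(p, r) - f1) for _, p, r, f1 in rows)


if __name__ == "__main__":
    for epoch, p, r, reported in ROWS:
        print(f"epoch {epoch}: recomputed {f1_score(p, r):.4f}, reported {reported:.4f}")
    print(f"largest gap: {max_f1_gap(ROWS):.6f}")
```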


### Framework versions

- Transformers 4.21.1
- PyTorch 1.12.1+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1