bert-linnaeus-ner
This model is a fine-tuned version of bert-base-cased on the linnaeus dataset. It achieves the following results on the evaluation set:
- Loss: 0.0073
- Precision: 0.9223
- Recall: 0.9522
- F1: 0.9370
- Accuracy: 0.9985
Model description
This model can be used to find organisms and species in text data.
Note: this model is a work in progress and is subject to change.
Intended uses & limitations
This model is intended for use in my Master's thesis, where it masks the names of bacteria (and phages) in text before further analysis.
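The masking step described above could look roughly like the sketch below. It is not the thesis code: the helper names `mask_entities` and `find_and_mask` and the `[MASK]` placeholder are illustrative assumptions. The model id is the one this card is published under, and the sketch assumes the tokenizer is a fast tokenizer so that the pipeline returns character offsets (`start`, `end`).

```python
MODEL_ID = "mikrz/bert-linnaeus-ner"  # model id of this card


def mask_entities(text, entities, mask_token="[MASK]"):
    """Replace each entity span in `text` with `mask_token`.

    `entities` is a list of dicts with character offsets under the keys
    "start" and "end", which is the shape returned by the Transformers
    token-classification pipeline. `mask_token` is an assumed placeholder.
    """
    # Work from right to left so that earlier offsets stay valid
    # after each replacement changes the string length.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[: ent["start"]] + mask_token + text[ent["end"] :]
    return text


def find_and_mask(text):
    """Run the NER model on `text` and mask every detected span (hypothetical helper)."""
    # Imported lazily so mask_entities works without transformers installed.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merge word pieces into whole spans
    )
    return mask_entities(text, ner(text))


if __name__ == "__main__":
    # Hand-written spans for illustration only, not real model output.
    sentence = "Escherichia coli was infected by phage T4."
    spans = [{"start": 0, "end": 16}, {"start": 33, "end": 41}]
    print(mask_entities(sentence, spans))
```

Replacing spans from right to left avoids having to recompute offsets after each substitution.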
Training and evaluation data
The linnaeus dataset was used for both training and validation.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
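To make the linear scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes 4476 total optimizer steps (3 epochs of 1492 steps, as in the training results) and no warmup, since this card does not report a warmup setting. The function `linear_lr` is a hypothetical helper written for illustration, not a library call.

```python
def linear_lr(step, base_lr=2e-5, total_steps=4476, warmup_steps=0):
    """Learning rate at `step` under linear decay to zero.

    warmup_steps=0 is an assumption; the card does not state a warmup value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Learning rate at the start of training and at the end of each epoch.
    for step in (0, 1492, 2984, 4476):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, reaches half that value midway through epoch 2, and hits zero at the final step.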
Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|---|
| 0.0076 | 1.0 | 1492 | 0.0128 | 0.8566 | 0.9578 | 0.9044 | 0.9967 |
| 0.0024 | 2.0 | 2984 | 0.0082 | 0.9092 | 0.9578 | 0.9329 | 0.9980 |
| 0.0007 | 3.0 | 4476 | 0.0073 | 0.9223 | 0.9522 | 0.9370 | 0.9985 |
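As a sanity check on the table, F1 is the harmonic mean of precision and recall, so each reported F1 can be recomputed from the other two columns. The sketch below does this for the three epochs; `f1_score` is a small helper defined here, not an import from a metrics library, and small differences are possible because the table values are rounded to four decimals.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1), copied from the table above.
ROWS = [
    (1, 0.8566, 0.9578, 0.9044),
    (2, 0.9092, 0.9578, 0.9329),
    (3, 0.9223, 0.9522, 0.9370),
]

if __name__ == "__main__":
    for epoch, p, r, reported in ROWS:
        print(epoch, round(f1_score(p, r), 4), reported)
```

The recomputed values agree with the reported F1 column to four decimal places.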
Framework versions
- Transformers 4.34.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.0
Model tree for mikrz/bert-linnaeus-ner
Base model: google-bert/bert-base-cased