bertweet-base_wnut17_ner / README.md

napsternxg

	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	- named-entity-recognition
	- token-classification
	datasets:
	- wnut_17
	metrics:
	- precision
	- recall
	- f1
	- accuracy
	base_model: vinai/bertweet-base
	model-index:
	- name: fine_tune_bertweet-base-lp-ft
	  results:
	  - task:
	      type: token-classification
	      name: Token Classification
	    dataset:
	      name: wnut_17
	      type: wnut_17
	      args: semval
	    metrics:
	    - type: precision
	      value: 0.6154830454254638
	      name: Precision
	    - type: recall
	      value: 0.49844559585492226
	      name: Recall
	    - type: f1
	      value: 0.5508159175493844
	      name: F1
	    - type: accuracy
	      value: 0.9499198834668608
	      name: Accuracy
	---


	# BERTweet-base fine-tuned on wnut17_ner

	This model is a fine-tuned version of [vinai/bertweet-base](https://huggingface.co/vinai/bertweet-base) on the [wnut_17](https://huggingface.co/datasets/wnut_17) dataset.

	It achieves the following results on the evaluation set. The figures match the final logged epoch (epoch 14, step 2982) in the training results table:
	- Loss: 0.3376
	- Overall Precision: 0.6803
	- Overall Recall: 0.6096
	- Overall F1: 0.6430
	- Overall Accuracy: 0.9509
	- Corporation F1: 0.2975
	- Creative-work F1: 0.4436
	- Group F1: 0.3624
	- Location F1: 0.6834
	- Person F1: 0.7902
	- Product F1: 0.3887
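The overall F1 in the list above should be the harmonic mean of the overall precision and recall. As a quick consistency check (a minimal sketch; `f1_score` is a hypothetical helper, not part of the training code), the reported value can be recomputed from the other two:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation-set list above.
overall_precision = 0.6803
overall_recall = 0.6096

overall_f1 = f1_score(overall_precision, overall_recall)
print(f"{overall_f1:.4f}")  # agrees with the reported Overall F1 of 0.6430
```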

	## Model description

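	[vinai/bertweet-base](https://huggingface.co/vinai/bertweet-base) with a token-classification head, fine-tuned for named-entity recognition on wnut_17. Per-entity scores are reported for six entity types: corporation, creative-work, group, location, person, and product.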

	## Intended uses & limitations

	More information needed
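The card does not document usage. As a minimal sketch, assuming the checkpoint is published on the Hugging Face Hub as `napsternxg/bertweet-base_wnut17_ner` (an id inferred from the repository name and owner, not confirmed by the card) and that `transformers` is installed, the model could be loaded through the generic token-classification pipeline. `load_ner_pipeline` and `print_entities` are hypothetical helper names.

```python
# Assumed Hub id, inferred from the repository name and owner.
MODEL_ID = "napsternxg/bertweet-base_wnut17_ner"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline for the fine-tuned checkpoint.

    The import lives inside the function so that defining it does not
    require transformers; calling it downloads the model weights.
    """
    from transformers import pipeline

    # "simple" merges adjacent tokens that share an entity label.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def print_entities(text: str) -> None:
    """Print each predicted entity group, its text, and its score."""
    ner = load_ner_pipeline()
    for entity in ner(text):
        print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```

Calling `print_entities("...")` on a tweet-like sentence would list the predicted spans. The label names that come back depend on the checkpoint's configuration, which the card does not show.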

	## Training and evaluation data

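	The model was trained and evaluated on the [wnut_17](https://huggingface.co/datasets/wnut_17) dataset. Metrics for its train, validation, and test splits are listed under "Overall results".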

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 100
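To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step under the hyperparameters above. It assumes zero warmup steps (the card lists none) and 213 steps per epoch (read from the Step column of the training results table); `linear_lr` is a hypothetical helper that mirrors the usual linear-decay-to-zero rule, not the exact training code.

```python
BASE_LR = 1e-05        # learning_rate from the list above
NUM_EPOCHS = 100       # num_epochs from the list above
STEPS_PER_EPOCH = 213  # from the training results table (step 213 at epoch 1.0)
WARMUP_STEPS = 0       # assumption: no warmup is listed in the card
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the last logged step (epoch 14, step 2982).
print(linear_lr(2982))
```

Under these assumptions the rate has decayed by only 14% at the last logged step, because the schedule is planned over 100 epochs while the log covers 14. The card does not say why the log ends there.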

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy | Corporation F1 | Creative-work F1 | Group F1 | Location F1 | Person F1 | Product F1 |
	|:-------------:|:-----:|:----:|:---------------:|:-----------------:|:--------------:|:----------:|:----------------:|:--------------:|:----------------:|:--------:|:-----------:|:---------:|:----------:|
	| 0.0215 | 1.0 | 213 | 0.2913 | 0.7026 | 0.5905 | 0.6417 | 0.9507 | 0.2832 | 0.4444 | 0.2975 | 0.6854 | 0.7788 | 0.4015 |
	| 0.0213 | 2.0 | 426 | 0.3052 | 0.6774 | 0.5772 | 0.6233 | 0.9495 | 0.2830 | 0.3483 | 0.3231 | 0.6857 | 0.7728 | 0.3794 |
	| 0.0288 | 3.0 | 639 | 0.3378 | 0.7061 | 0.5507 | 0.6188 | 0.9467 | 0.3077 | 0.4184 | 0.3529 | 0.6222 | 0.7532 | 0.3910 |
	| 0.0124 | 4.0 | 852 | 0.2712 | 0.6574 | 0.6121 | 0.6340 | 0.9502 | 0.3077 | 0.4842 | 0.3167 | 0.6809 | 0.7735 | 0.3986 |
	| 0.0208 | 5.0 | 1065 | 0.2905 | 0.7108 | 0.6063 | 0.6544 | 0.9518 | 0.3063 | 0.4286 | 0.3419 | 0.7052 | 0.7913 | 0.4223 |
	| 0.0071 | 6.0 | 1278 | 0.3189 | 0.6756 | 0.5847 | 0.6269 | 0.9494 | 0.2759 | 0.4380 | 0.3256 | 0.6744 | 0.7781 | 0.3779 |
	| 0.0073 | 7.0 | 1491 | 0.3593 | 0.7330 | 0.5540 | 0.6310 | 0.9476 | 0.3061 | 0.4388 | 0.3784 | 0.6946 | 0.7631 | 0.3374 |
	| 0.0135 | 8.0 | 1704 | 0.3564 | 0.6875 | 0.5482 | 0.6100 | 0.9471 | 0.3400 | 0.4179 | 0.3088 | 0.6632 | 0.7486 | 0.3695 |
	| 0.0097 | 9.0 | 1917 | 0.3085 | 0.6598 | 0.6395 | 0.6495 | 0.9516 | 0.3111 | 0.4609 | 0.3836 | 0.7090 | 0.7906 | 0.4083 |
	| 0.0108 | 10.0 | 2130 | 0.3045 | 0.6605 | 0.6478 | 0.6541 | 0.9509 | 0.3529 | 0.4580 | 0.3649 | 0.6897 | 0.7843 | 0.4387 |
	| 0.0130 | 11.0 | 2343 | 0.3383 | 0.6788 | 0.6179 | 0.6470 | 0.9507 | 0.2783 | 0.4248 | 0.3358 | 0.7368 | 0.7958 | 0.3655 |
	| 0.0076 | 12.0 | 2556 | 0.3617 | 0.6920 | 0.5523 | 0.6143 | 0.9474 | 0.2708 | 0.3985 | 0.3333 | 0.6740 | 0.7566 | 0.3525 |
	| 0.0042 | 13.0 | 2769 | 0.3747 | 0.6896 | 0.5664 | 0.6220 | 0.9473 | 0.2478 | 0.3915 | 0.3521 | 0.6561 | 0.7742 | 0.3539 |
	| 0.0049 | 14.0 | 2982 | 0.3376 | 0.6803 | 0.6096 | 0.6430 | 0.9509 | 0.2975 | 0.4436 | 0.3624 | 0.6834 | 0.7902 | 0.3887 |
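Which epoch counts as "best" in the table above depends on the selection criterion. The sketch below picks the best epoch by validation loss and by overall F1, using values copied from the table. The card does not state which criterion, if any, was used for checkpoint selection, so this is only an illustration.

```python
# (epoch, validation loss, overall F1), copied from the training results table.
ROWS = [
    (1, 0.2913, 0.6417), (2, 0.3052, 0.6233), (3, 0.3378, 0.6188),
    (4, 0.2712, 0.6340), (5, 0.2905, 0.6544), (6, 0.3189, 0.6269),
    (7, 0.3593, 0.6310), (8, 0.3564, 0.6100), (9, 0.3085, 0.6495),
    (10, 0.3045, 0.6541), (11, 0.3383, 0.6470), (12, 0.3617, 0.6143),
    (13, 0.3747, 0.6220), (14, 0.3376, 0.6430),
]

# Lowest validation loss and highest overall F1 need not coincide.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_f1 = max(ROWS, key=lambda row: row[2])

print("best by validation loss: epoch", best_by_loss[0])
print("best by overall F1:      epoch", best_by_f1[0])
```

The two criteria disagree here (epoch 4 by loss, epoch 5 by F1). The validation loss at epoch 4 (0.2712) matches the validation loss in the "Overall results" table (0.271155), which suggests, though the card does not confirm it, that the lowest-loss checkpoint is the one evaluated there.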


	### Overall results

	| metric_type | train | validation | test |
	|:-------------------|-----------:|-----------:|-----------:|
	| loss | 0.012030 | 0.271155 | 0.273943 |
	| runtime (s) | 16.292400 | 5.068800 | 8.596800 |
	| samples_per_second | 208.318000 | 199.060000 | 149.707000 |
	| steps_per_second | 13.074000 | 12.626000 | 9.422000 |
	| corporation_f1 | 0.936877 | 0.307692 | 0.368627 |
	| person_f1 | 0.984252 | 0.773455 | 0.689826 |
	| product_f1 | 0.893246 | 0.398625 | 0.270423 |
	| creative-work_f1 | 0.880562 | 0.484211 | 0.415274 |
	| group_f1 | 0.975547 | 0.316667 | 0.411348 |
	| location_f1 | 0.978887 | 0.680851 | 0.638695 |
	| overall_accuracy | 0.997709 | 0.950244 | 0.949920 |
	| overall_f1 | 0.961113 | 0.633978 | 0.550816 |
	| overall_precision | 0.956337 | 0.657449 | 0.615483 |
	| overall_recall | 0.965938 | 0.612126 | 0.498446 |
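The gap between `overall_accuracy` (about 0.95) and `overall_f1` (about 0.55) on the test split is what one would expect if accuracy is counted per token while precision, recall, and F1 are counted per entity span. The card does not say how the metrics were computed, so the sketch below assumes BIO-tagged labels and exact-match span scoring. `extract_spans` and `span_scores` are hypothetical helpers, and the tag sequences are made-up toy data.

```python
def extract_spans(tags):
    """Return the set of (entity_type, start, end) spans in a BIO sequence (end exclusive)."""
    spans = set()
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((etype, start, i))
        if tag.startswith(("B-", "I-")):
            # A stray I- tag is treated leniently as the start of a new span.
            start, etype = i, tag[2:]
        else:
            start, etype = None, None
    return spans


def span_scores(gold, pred):
    """Entity-level precision, recall, F1 plus token-level accuracy."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    tp = len(gold_spans & pred_spans)  # only exact span matches count
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return precision, recall, f1, accuracy


# Toy example: one wrong token out of ten breaks one of two entities.
gold = ["O", "O", "B-person", "I-person", "O", "O", "O", "B-location", "O", "O"]
pred = ["O", "O", "B-person", "O", "O", "O", "O", "B-location", "O", "O"]
print(span_scores(gold, pred))  # (0.5, 0.5, 0.5, 0.9)
```

A single token error leaves token accuracy at 0.9 but halves the entity-level scores, because a partially matched span earns no credit.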


	### Framework versions

	- Transformers 4.17.0
	- PyTorch 1.11.0+cu113
	- Datasets 2.0.0
	- Tokenizers 0.11.6