---
license: mit
tags:
- generated_from_trainer
model-index:
- name: multiCorp_5e-05_0404
  results: []
---
# multiCorp_5e-05_0404
This model is a fine-tuned version of [microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext) for token classification on a combined multi-corpus dataset (the corpora are not documented in this card).
It achieves the following results on the evaluation set:

- eval_loss: 0.0657
- eval_precision: 0.6398
- eval_recall: 0.6267
- eval_f1: 0.6332
- eval_accuracy: 0.9847
- eval_runtime: 39.7302
- eval_samples_per_second: 32.544
- eval_steps_per_second: 2.039
- epoch: 3.41
- step: 1100
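These numbers are the metrics dictionary produced by the Trainer's evaluation loop; a minimal sketch of reproducing them, assuming the `trainer` and `tokenized_data` objects from the training code below:

```python
# Re-run evaluation on the validation split; trainer and tokenized_data
# are defined in the training code later in this card.
metrics = trainer.evaluate(eval_dataset=tokenized_data["validation"])
print(metrics["eval_loss"], metrics["eval_f1"])
```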

## Multi-corpus training

The model is initialized as a token-classification head over PubMedBERT with 41 labels:

```python
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained(
    "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext",
    num_labels=41,
    id2label=id2label,
    label2id=label2id,
)
```
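The `id2label` and `label2id` mappings are not shown in the original card; a minimal sketch of how they are typically built, using a hypothetical BIO tag list (the real inventory has 41 tags):

```python
# Hypothetical label list for illustration only; substitute the actual
# 41-tag inventory of the training corpora.
label_list = ["O", "B-Chemical", "I-Chemical", "B-Disease", "I-Disease"]
id2label = {i: tag for i, tag in enumerate(label_list)}
label2id = {tag: i for i, tag in enumerate(label_list)}
```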

The training arguments enable W&B logging, step-based evaluation, and checkpointing every 25 steps (`runname` is defined elsewhere in the training script):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    report_to="wandb",                  # enable logging to W&B
    output_dir=runname,                 # output directory / repo name for the Hugging Face Hub
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    evaluation_strategy="steps",        # evaluate every eval_steps, not once per epoch
    max_steps=2000,
    logging_steps=25,                   # log every 25 steps
    eval_steps=25,                      # evaluate every 25 steps
    save_steps=25,                      # save a checkpoint every 25 steps
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,            # lower eval_loss is better
    push_to_hub=True,
    run_name=runname,                   # name of the W&B run
)
```

The trainer ties together the model, data, metrics, and an early-stopping callback:

```python
from transformers import Trainer, EarlyStoppingCallback

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_data["train"],
    eval_dataset=tokenized_data["validation"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    # Stop if eval_loss fails to improve for 6 consecutive evaluations (150 steps).
    callbacks=[EarlyStoppingCallback(early_stopping_patience=6)],
)
trainer.train()
```
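The `tokenizer`, `data_collator`, and `compute_metrics` objects are not shown in the original card. Below is a minimal sketch of the standard token-classification setup they presumably follow; the seqeval-based metric is an assumption, chosen because it yields exactly the precision/recall/F1/accuracy columns logged below:

```python
# Sketch of the helper objects assumed by the Trainer call above.
import numpy as np
import evaluate
from transformers import AutoTokenizer, DataCollatorForTokenClassification

tokenizer = AutoTokenizer.from_pretrained(
    "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext"
)
data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer)
seqeval = evaluate.load("seqeval")

def compute_metrics(eval_preds):
    logits, labels = eval_preds
    predictions = np.argmax(logits, axis=-1)
    # Drop padding/subword positions, which carry the label id -100.
    true_labels = [
        [id2label[l] for l in row if l != -100] for row in labels
    ]
    true_predictions = [
        [id2label[p] for p, l in zip(p_row, l_row) if l != -100]
        for p_row, l_row in zip(predictions, labels)
    ]
    results = seqeval.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
```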

### Training log

At the time this card was generated, training had reached step 1101 of 2000 (epoch 3.41/7, roughly 0.30 it/s, 1:00:33 elapsed):

| Step | Training Loss | Validation Loss | Precision | Recall | F1 | Accuracy |
|-----:|--------------:|----------------:|----------:|-------:|---:|---------:|
| 25 | 0.836100 | 0.201612 | 0.000000 | 0.000000 | 0.000000 | 0.973546 |
| 50 | 0.149500 | 0.154239 | 0.233246 | 0.124420 | 0.162277 | 0.972420 |
| 75 | 0.136300 | 0.138105 | 0.145299 | 0.094708 | 0.114671 | 0.972385 |
| 100 | 0.129900 | 0.123477 | 0.425243 | 0.203343 | 0.275126 | 0.975886 |
| 125 | 0.103100 | 0.118570 | 0.297553 | 0.321727 | 0.309168 | 0.974136 |
| 150 | 0.117300 | 0.113230 | 0.393373 | 0.214949 | 0.277995 | 0.977039 |
| 175 | 0.117500 | 0.106183 | 0.320082 | 0.291551 | 0.305151 | 0.975930 |
| 200 | 0.093800 | 0.102443 | 0.353604 | 0.291551 | 0.319593 | 0.975297 |
| 225 | 0.091900 | 0.105976 | 0.446684 | 0.318942 | 0.372156 | 0.977127 |
| 250 | 0.088700 | 0.093393 | 0.439173 | 0.335190 | 0.380200 | 0.977734 |
| 275 | 0.113300 | 0.097715 | 0.522222 | 0.218199 | 0.307793 | 0.977637 |
| 300 | 0.092900 | 0.085730 | 0.473552 | 0.349118 | 0.401924 | 0.979405 |
| 325 | 0.085700 | 0.091731 | 0.380009 | 0.409471 | 0.394190 | 0.976960 |
| 350 | 0.081700 | 0.086656 | 0.554161 | 0.389508 | 0.457470 | 0.980162 |
| 375 | 0.062400 | 0.083441 | 0.538000 | 0.374652 | 0.441708 | 0.980769 |
| 400 | 0.077500 | 0.085072 | 0.486742 | 0.477252 | 0.481950 | 0.978869 |
| 425 | 0.073000 | 0.078521 | 0.516658 | 0.467967 | 0.491108 | 0.981103 |
| 450 | 0.081000 | 0.077073 | 0.552381 | 0.430826 | 0.484090 | 0.981288 |
| 475 | 0.075100 | 0.078478 | 0.483887 | 0.446147 | 0.464251 | 0.980408 |
| 500 | 0.062800 | 0.073298 | 0.550633 | 0.484680 | 0.515556 | 0.982247 |
| 525 | 0.060600 | 0.069571 | 0.542723 | 0.536676 | 0.539683 | 0.982608 |
| 550 | 0.063900 | 0.071559 | 0.539832 | 0.506500 | 0.522635 | 0.981983 |
| 575 | 0.060700 | 0.068333 | 0.564646 | 0.519034 | 0.540881 | 0.982546 |
| 600 | 0.062900 | 0.072810 | 0.602013 | 0.416435 | 0.492316 | 0.981886 |
| 625 | 0.051300 | 0.071469 | 0.550901 | 0.525070 | 0.537675 | 0.982335 |
| 650 | 0.059500 | 0.067657 | 0.553466 | 0.478180 | 0.513076 | 0.982528 |
| 675 | 0.047500 | 0.067443 | 0.594739 | 0.566852 | 0.580461 | 0.983663 |
| 700 | 0.052100 | 0.065269 | 0.564447 | 0.546890 | 0.555529 | 0.983039 |
| 725 | 0.041500 | 0.067790 | 0.593516 | 0.552461 | 0.572253 | 0.983672 |
| 750 | 0.046300 | 0.067922 | 0.609038 | 0.538069 | 0.571358 | 0.983461 |
| 775 | 0.054300 | 0.064636 | 0.646725 | 0.582173 | 0.612753 | 0.984499 |
| 800 | 0.049500 | 0.067722 | 0.650905 | 0.517642 | 0.576674 | 0.983830 |
| 825 | 0.043100 | 0.069327 | 0.630043 | 0.471216 | 0.539177 | 0.982880 |
| 850 | 0.048000 | 0.063814 | 0.631025 | 0.528784 | 0.575398 | 0.984068 |
| 875 | 0.042500 | 0.064527 | 0.644913 | 0.582637 | 0.612195 | 0.984543 |
| 900 | 0.043500 | 0.065475 | 0.608295 | 0.490251 | 0.542931 | 0.983522 |
| 925 | 0.039200 | 0.066043 | 0.635938 | 0.566852 | 0.599411 | 0.984323 |
| 950 | 0.046800 | 0.062491 | 0.646930 | 0.547818 | 0.593263 | 0.984719 |
| 975 | 0.043700 | 0.061204 | 0.634625 | 0.585422 | 0.609032 | 0.984543 |
| 1000 | 0.032000 | 0.066377 | 0.643390 | 0.560353 | 0.599007 | 0.984349 |
| 1025 | 0.038100 | 0.064764 | 0.666482 | 0.559424 | 0.608279 | 0.984745 |
| 1050 | 0.035300 | 0.065642 | 0.635359 | 0.587279 | 0.610374 | 0.984464 |
| 1075 | 0.032800 | 0.064835 | 0.657262 | 0.584030 | 0.618486 | 0.984587 |
| 1100 | 0.031700 | 0.065726 | 0.639810 | 0.626741 | 0.633208 | 0.984710 |

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent optimizer/scheduler construction is sketched after the list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 2000
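Assuming the Trainer defaults of Transformers 4.27 (AdamW with linear decay and no warmup), these settings correspond to roughly the following; this is an illustrative sketch, not code from the original run:

```python
# Equivalent optimizer/scheduler for the hyperparameters above, assuming
# Trainer defaults (AdamW, linear decay, num_warmup_steps=0).
import torch
from transformers import get_linear_schedule_with_warmup

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=5e-5,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.01,
)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=2000
)
```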

### Framework versions

- Transformers 4.27.4
- Pytorch 2.0.0+cu118
- Datasets 2.11.0
- Tokenizers 0.13.2