	---
	license: mit
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv-v3-e16
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv-v3-e16

	This model is a fine-tuned version of [theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv](https://huggingface.co/theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.8960
	- Rouge1: 57.7198
	- Rouge2: 44.5711
	- Rougel: 47.6281
	- Rougelsum: 56.2372
	- Gen Len: 142.0
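The card does not document how to run the model, so the following is only a minimal sketch. It assumes the checkpoint loads through the standard `transformers` summarization pipeline, as BART checkpoints usually do. The `summarize` helper and the `max_length=142` default (chosen to mirror the reported Gen Len) are illustrative assumptions, not settings confirmed by the author.

```python
# Hub repository id, taken from this model card's title.
MODEL_ID = "theojolliffe/bart-cnn-pubmed-arxiv-pubmed-arxiv-arxiv-v3-e16"


def summarize(text: str, max_length: int = 142) -> str:
    """Summarize `text` with this checkpoint (downloads weights on first use)."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # max_length=142 mirrors the "Gen Len" reported above; this is an assumption,
    # not a documented generation setting of the model.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize("...long document text...")` returns a single summary string. The pipeline is rebuilt on every call here for simplicity; for repeated use, create it once and reuse it.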

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 2
	- eval_batch_size: 2
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 16
	- mixed_precision_training: Native AMP
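To make the `linear` scheduler setting concrete, here is a small sketch of the learning rate over the run. It assumes zero warmup steps (none are listed above) and takes the total step count from the results table. The helper `linear_lr` is illustrative and is not the Trainer's own code. The dataset-size estimate further assumes a single device and no gradient accumulation, neither of which the card states.

```python
# Values copied from the hyperparameter list and the results table.
BASE_LR = 2e-05
TOTAL_STEPS = 6368      # final step in the results table (16 epochs)
STEPS_PER_EPOCH = 398   # step count after epoch 1
TRAIN_BATCH_SIZE = 2


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


# Assumes one device and no gradient accumulation (neither is listed in the card).
approx_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

for epoch in (0, 8, 16):
    print(f"epoch {epoch:2d}: lr = {linear_lr(epoch * STEPS_PER_EPOCH):.2e}")
print("approx. training examples:", approx_train_examples)
```

Under these assumptions the rate falls to half its initial value by the end of epoch 8 and reaches zero at step 6368. The training set comes out at roughly 796 examples, possibly one fewer if the last batch of an epoch was partial.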

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| No log        | 1.0   | 398  | 0.8634          | 53.7416 | 34.3731 | 37.1193 | 51.3075   | 142.0   |
	| 0.8276        | 2.0   | 796  | 0.8001          | 53.9975 | 35.1019 | 38.2722 | 51.7878   | 142.0   |
	| 0.5311        | 3.0   | 1194 | 0.7988          | 53.409  | 34.3201 | 37.5443 | 50.738    | 142.0   |
	| 0.3538        | 4.0   | 1592 | 0.7698          | 53.679  | 34.7209 | 37.7895 | 51.2497   | 142.0   |
	| 0.3538        | 5.0   | 1990 | 0.7863          | 54.2493 | 36.0643 | 39.1249 | 51.9758   | 142.0   |
	| 0.2367        | 6.0   | 2388 | 0.7810          | 54.4042 | 37.4276 | 41.529  | 52.1544   | 142.0   |
	| 0.164         | 7.0   | 2786 | 0.8055          | 56.0408 | 39.6744 | 42.8323 | 54.163    | 142.0   |
	| 0.1146        | 8.0   | 3184 | 0.8098          | 55.2046 | 38.5399 | 41.9178 | 53.0001   | 142.0   |
	| 0.089         | 9.0   | 3582 | 0.8199          | 57.1523 | 41.7614 | 44.5914 | 55.1602   | 142.0   |
	| 0.089         | 10.0  | 3980 | 0.8644          | 56.943  | 41.5063 | 44.4929 | 54.9515   | 142.0   |
	| 0.0647        | 11.0  | 4378 | 0.8413          | 57.0321 | 41.964  | 45.3971 | 55.0957   | 142.0   |
	| 0.0485        | 12.0  | 4776 | 0.8735          | 56.7275 | 41.8577 | 44.3911 | 54.9824   | 142.0   |
	| 0.0365        | 13.0  | 5174 | 0.8858          | 57.6103 | 43.8831 | 47.0374 | 56.0675   | 142.0   |
	| 0.0271        | 14.0  | 5572 | 0.8974          | 57.39   | 42.8693 | 45.9344 | 55.7404   | 142.0   |
	| 0.0271        | 15.0  | 5970 | 0.8990          | 57.9433 | 44.7301 | 47.843  | 56.5407   | 142.0   |
	| 0.0232        | 16.0  | 6368 | 0.8960          | 57.7198 | 44.5711 | 47.6281 | 56.2372   | 142.0   |
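Validation loss and ROUGE peak at different epochs in these results, which matters when choosing a checkpoint. The sketch below copies three columns from the table and finds the best epoch under each criterion. The names `RESULTS`, `best_by_loss`, and `best_by_rouge1` are illustrative only.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
RESULTS = [
    (1, 0.8634, 53.7416), (2, 0.8001, 53.9975), (3, 0.7988, 53.409),
    (4, 0.7698, 53.679), (5, 0.7863, 54.2493), (6, 0.7810, 54.4042),
    (7, 0.8055, 56.0408), (8, 0.8098, 55.2046), (9, 0.8199, 57.1523),
    (10, 0.8644, 56.943), (11, 0.8413, 57.0321), (12, 0.8735, 56.7275),
    (13, 0.8858, 57.6103), (14, 0.8974, 57.39), (15, 0.8990, 57.9433),
    (16, 0.8960, 57.7198),
]

# Lowest validation loss versus highest Rouge1.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

Validation loss is lowest at epoch 4 (0.7698), while Rouge1 is highest at epoch 15 (57.9433). The evaluation figures reported at the top of this card match the epoch 16 row.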


	### Framework versions

	- Transformers 4.19.2
	- PyTorch 1.11.0+cu113
	- Datasets 2.2.2
	- Tokenizers 0.12.1
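When reproducing the run, it can help to check the local environment against the versions listed above. The following sketch uses only the standard library. The helper `version_mismatches` is hypothetical, and ignoring local build tags such as `+cu113` in the comparison is an assumption made for convenience.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.19.2",
    "torch": "1.11.0+cu113",
    "datasets": "2.2.2",
    "tokenizers": "0.12.1",
}


def version_mismatches(expected=CARD_VERSIONS, get_version=metadata.version):
    """Return {package: (expected, installed)} for every package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        # Compare only the public version, ignoring local tags such as "+cu113".
        if installed is None or installed.split("+")[0] != wanted.split("+")[0]:
            mismatches[name] = (wanted, installed)
    return mismatches
```

Calling `version_mismatches()` with no arguments inspects the current environment. An empty dictionary means every listed package matches the card.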