learn3r
/

bart_large_gov


Model save

f37cbda verified 9 months ago


	---
	license: apache-2.0
	base_model: facebook/bart-large
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: bart_large_gov
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# bart_large_gov

	This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on an unknown dataset.
	At the final checkpoint (epoch 19.94, step 2720), it achieves the following results on the evaluation set:
	- Loss: 1.6328
	- Rouge1: 55.8983
	- Rouge2: 30.5018
	- Rougel: 38.7764
	- Rougelsum: 51.3611
	- Gen Len: 128.7407
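
The card does not document usage, so the following is only a minimal sketch of loading the checkpoint for sequence-to-sequence generation with the standard `transformers` auto classes. The repository name `learn3r/bart_large_gov` comes from this card; the helper names `load_model` and `summarize`, the 1024-token input limit, the beam width, and the 128-token output cap are assumptions chosen for illustration, not settings confirmed by the author.

```python
from functools import lru_cache

MODEL_ID = "learn3r/bart_large_gov"  # repository name taken from this card


@lru_cache(maxsize=1)
def load_model():
    """Load the tokenizer and model once (requires `transformers` and `torch`)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    model.eval()
    return tokenizer, model


def summarize(text, max_input_tokens=1024, max_new_tokens=128, num_beams=4):
    """Generate one output string for `text`.

    All generation settings here are assumptions; the card does not state
    which settings produced the reported ROUGE scores.
    """
    tokenizer, model = load_model()
    # Truncate long inputs so they fit the encoder's position limit.
    inputs = tokenizer(
        text,
        truncation=True,
        max_length=max_input_tokens,
        return_tensors="pt",
    )
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(document_text)` downloads the weights on first use and returns the decoded generation as a plain string.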

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 20.0
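
The list above can be restated as keyword arguments for `Seq2SeqTrainingArguments`, which also makes the batch-size arithmetic explicit. This is a sketch under stated assumptions, not the author's training script: a single device is assumed (the card lists no device count), zero warmup is assumed (the card lists none), the total of 2720 optimizer steps is read from the training results table, and the helper names and `output_dir` value are hypothetical.

```python
# Hyperparameters from the card, keyed by Seq2SeqTrainingArguments names.
HYPERPARAMETERS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 16,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20.0,
}

NUM_DEVICES = 1     # assumption: the card does not list a device count
WARMUP_STEPS = 0    # assumption: the card does not list any warmup
TOTAL_STEPS = 2720  # last step in the training results table


def effective_batch_size(hp=HYPERPARAMETERS, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, peak=HYPERPARAMETERS["learning_rate"],
              total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup:
        return peak * (step / max(1, warmup))
    remaining = max(0, total - step)
    return peak * (remaining / max(1, total - warmup))


def build_training_arguments(output_dir="bart_large_gov"):
    """Build the arguments object (requires `transformers`; hypothetical output_dir)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)


print(effective_batch_size())  # 8 * 16 * 1 = 128
print(linear_lr(1360))         # halfway through training: half the peak rate
```

Under the single-device assumption, 8 examples per device times 16 accumulation steps reproduces the reported `total_train_batch_size` of 128.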

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
	| 1.6694        | 1.0   | 136  | 1.5338          | 54.2061 | 29.3577 | 37.2911 | 49.8337   | 139.7253 |
	| 1.5178        | 1.99  | 272  | 1.4698          | 55.6621 | 30.6254 | 38.7491 | 51.2934   | 128.9475 |
	| 1.4208        | 3.0   | 409  | 1.4487          | 55.4905 | 30.4201 | 38.412  | 51.1108   | 129.5658 |
	| 1.3399        | 3.99  | 545  | 1.4450          | 56.2783 | 31.1387 | 39.2121 | 51.8068   | 128.5062 |
	| 1.2326        | 5.0   | 682  | 1.4478          | 56.0182 | 30.7104 | 38.8337 | 51.6162   | 129.1358 |
	| 1.1784        | 6.0   | 818  | 1.4533          | 56.4333 | 31.4483 | 39.5546 | 52.1347   | 128.7315 |
	| 1.1739        | 7.0   | 955  | 1.4607          | 56.3636 | 31.1125 | 39.4055 | 51.9709   | 128.8241 |
	| 1.1585        | 8.0   | 1091 | 1.4774          | 55.9356 | 30.7012 | 38.7824 | 51.5664   | 128.9640 |
	| 1.0297        | 8.99  | 1227 | 1.4939          | 56.7487 | 31.552  | 39.6461 | 52.411    | 128.6553 |
	| 1.0085        | 10.0  | 1364 | 1.5075          | 56.3918 | 31.2201 | 39.4213 | 51.9449   | 128.6265 |
	| 0.9738        | 10.99 | 1500 | 1.5237          | 56.3041 | 30.9239 | 39.2625 | 51.8217   | 128.8282 |
	| 0.9583        | 12.0  | 1637 | 1.5444          | 55.6539 | 30.2395 | 38.5901 | 51.1518   | 128.9136 |
	| 0.9601        | 12.99 | 1773 | 1.5516          | 55.9154 | 30.5471 | 38.8607 | 51.2856   | 128.9784 |
	| 0.8882        | 14.0  | 1910 | 1.5736          | 56.3282 | 30.9807 | 39.2351 | 51.8022   | 128.5206 |
	| 0.851         | 15.0  | 2046 | 1.5891          | 56.0531 | 30.6748 | 38.8847 | 51.5739   | 128.7623 |
	| 0.8825        | 16.0  | 2183 | 1.5978          | 56.0084 | 30.7943 | 38.9692 | 51.5587   | 128.7798 |
	| 0.8169        | 17.0  | 2319 | 1.6076          | 55.8274 | 30.41   | 38.6258 | 51.3009   | 128.8632 |
	| 0.8194        | 17.99 | 2455 | 1.6177          | 56.3214 | 30.9896 | 39.4754 | 51.9525   | 128.6461 |
	| 0.8441        | 19.0  | 2592 | 1.6260          | 55.9842 | 30.6332 | 38.999  | 51.5685   | 128.8241 |
	| 0.792         | 19.94 | 2720 | 1.6328          | 55.8983 | 30.5018 | 38.7764 | 51.3611   | 128.7407 |
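
The headline results correspond to the last row of this table, which is not the best row on either validation loss or ROUGE. The short sketch below copies five columns from the table and picks out the best rows, as one way to read it; the `Row` type and the variable names are illustrative only, and the card does not say which checkpoint the author intended to publish.

```python
from collections import namedtuple

Row = namedtuple("Row", "epoch step train_loss val_loss rouge1")

# Values copied from the training results table above.
ROWS = [
    Row(1.0, 136, 1.6694, 1.5338, 54.2061),
    Row(1.99, 272, 1.5178, 1.4698, 55.6621),
    Row(3.0, 409, 1.4208, 1.4487, 55.4905),
    Row(3.99, 545, 1.3399, 1.4450, 56.2783),
    Row(5.0, 682, 1.2326, 1.4478, 56.0182),
    Row(6.0, 818, 1.1784, 1.4533, 56.4333),
    Row(7.0, 955, 1.1739, 1.4607, 56.3636),
    Row(8.0, 1091, 1.1585, 1.4774, 55.9356),
    Row(8.99, 1227, 1.0297, 1.4939, 56.7487),
    Row(10.0, 1364, 1.0085, 1.5075, 56.3918),
    Row(10.99, 1500, 0.9738, 1.5237, 56.3041),
    Row(12.0, 1637, 0.9583, 1.5444, 55.6539),
    Row(12.99, 1773, 0.9601, 1.5516, 55.9154),
    Row(14.0, 1910, 0.8882, 1.5736, 56.3282),
    Row(15.0, 2046, 0.851, 1.5891, 56.0531),
    Row(16.0, 2183, 0.8825, 1.5978, 56.0084),
    Row(17.0, 2319, 0.8169, 1.6076, 55.8274),
    Row(17.99, 2455, 0.8194, 1.6177, 56.3214),
    Row(19.0, 2592, 0.8441, 1.6260, 55.9842),
    Row(19.94, 2720, 0.792, 1.6328, 55.8983),
]

best_loss = min(ROWS, key=lambda r: r.val_loss)   # lowest validation loss
best_rouge1 = max(ROWS, key=lambda r: r.rouge1)   # highest Rouge1
final = ROWS[-1]                                  # row reported as the headline result

print(best_loss.step, best_loss.val_loss)     # 545 1.445
print(best_rouge1.step, best_rouge1.rouge1)   # 1227 56.7487
print(round(final.val_loss - best_loss.val_loss, 4))  # 0.1878
```

Validation loss bottoms out at step 545 (epoch 3.99) and rises afterwards while training loss keeps falling, and Rouge1 peaks at step 1227 (epoch 8.99), so an earlier checkpoint may be preferable to the final one depending on which metric matters.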


	### Framework versions

	- Transformers 4.37.0.dev0
	- PyTorch 2.0.1+cu117
	- Datasets 2.14.5
	- Tokenizers 0.15.0