README.md · eslamxm/AraT5-base-title-generation-finetune-ar-xlsum at main

	---
	tags:
	- summarization
	- Arat5-base
	- abstractive summarization
	- ar
	- xlsum
	- generated_from_trainer
	datasets:
	- xlsum
	model-index:
	- name: AraT5-base-title-generation-finetune-ar-xlsum
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# AraT5-base-title-generation-finetune-ar-xlsum

	This model is a fine-tuned version of [UBC-NLP/AraT5-base-title-generation](https://huggingface.co/UBC-NLP/AraT5-base-title-generation) on the xlsum dataset (the Arabic portion, judging by the `ar` tag and the model name).
	It achieves the following results on the evaluation set:
	- Loss: 4.2837
	- Rouge-1: 32.46
	- Rouge-2: 15.15
	- Rouge-l: 28.38
	- Gen Len: 18.48
	- Bertscore: 74.24
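
The card does not include a usage snippet, so the following is a minimal sketch of how a checkpoint like this is typically loaded with the `transformers` library. The repository ID is taken from the page title; the function name `generate_summary`, the 512-token input limit, the beam size, and `max_new_tokens=32` are illustrative assumptions, not settings documented by the author.

```python
MODEL_ID = "eslamxm/AraT5-base-title-generation-finetune-ar-xlsum"


def generate_summary(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 32) -> str:
    """Generate a short summary or title for one Arabic article (hypothetical helper)."""
    # Imported lazily so this module can be loaded without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long articles; 512 tokens is an assumed limit, not a documented one.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

    # Beam search with an assumed beam size of 4; tune for your own use case.
    output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_summary(article_text)` downloads the checkpoint on first use. The reported Gen Len of about 18 tokens suggests short outputs, so a small `max_new_tokens` is probably sufficient.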

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0005
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 250
	- num_epochs: 10
	- label_smoothing_factor: 0.1
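
The hyperparameters above imply two derived quantities: the effective batch size and the learning rate at any optimizer step. The sketch below reproduces both in plain Python. It assumes the usual linear-warmup-then-linear-decay shape for `lr_scheduler_type: linear`, a single device (consistent with 8 × 16 = 128), and 2930 total optimizer steps, the last step shown in the training results table. The function names are hypothetical.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 5e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 16
WARMUP_STEPS = 250
TOTAL_STEPS = 2930  # assumed: final step in the training results table


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int, base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS, total: int = TOTAL_STEPS) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay from base_lr at the end of warmup to 0 at the final step.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 125, 250, 1590, 2930):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 0.0005 at step 250, less than one epoch into training, and decays for the remaining nine-plus epochs.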

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge-1 | Rouge-2 | Rouge-l | Gen Len | Bertscore |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:-------:|:---------:|
	| 5.815 | 1.0 | 293 | 4.7437 | 27.05 | 10.49 | 23.56 | 18.03 | 72.56 |
	| 5.0818 | 2.0 | 586 | 4.5004 | 28.92 | 11.97 | 25.09 | 18.61 | 73.08 |
	| 4.7855 | 3.0 | 879 | 4.3910 | 29.66 | 12.57 | 25.79 | 18.58 | 73.3 |
	| 4.588 | 4.0 | 1172 | 4.3469 | 30.22 | 13.05 | 26.36 | 18.59 | 73.61 |
	| 4.4388 | 5.0 | 1465 | 4.3226 | 30.88 | 13.81 | 27.01 | 18.65 | 73.78 |
	| 4.3162 | 6.0 | 1758 | 4.2990 | 30.9 | 13.6 | 26.92 | 18.68 | 73.78 |
	| 4.2178 | 7.0 | 2051 | 4.2869 | 31.35 | 14.01 | 27.41 | 18.57 | 73.96 |
	| 4.1387 | 8.0 | 2344 | 4.2794 | 31.28 | 13.98 | 27.34 | 18.6 | 73.87 |
	| 4.0787 | 9.0 | 2637 | 4.2806 | 31.45 | 14.17 | 27.46 | 18.66 | 73.97 |
	| 4.0371 | 10.0 | 2930 | 4.2837 | 31.55 | 14.19 | 27.52 | 18.65 | 74.0 |
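
As a quick way to read the table, the sketch below picks the best epoch per metric from the values copied out of it. The helper name `best_epoch` and the tuple layout are illustrative choices, and it says nothing about how the author selected the final checkpoint.

```python
# (epoch, validation_loss, rouge1, rouge2, rougeL, bertscore), copied from the table.
ROWS = [
    (1, 4.7437, 27.05, 10.49, 23.56, 72.56),
    (2, 4.5004, 28.92, 11.97, 25.09, 73.08),
    (3, 4.3910, 29.66, 12.57, 25.79, 73.30),
    (4, 4.3469, 30.22, 13.05, 26.36, 73.61),
    (5, 4.3226, 30.88, 13.81, 27.01, 73.78),
    (6, 4.2990, 30.90, 13.60, 26.92, 73.78),
    (7, 4.2869, 31.35, 14.01, 27.41, 73.96),
    (8, 4.2794, 31.28, 13.98, 27.34, 73.87),
    (9, 4.2806, 31.45, 14.17, 27.46, 73.97),
    (10, 4.2837, 31.55, 14.19, 27.52, 74.00),
]


def best_epoch(rows, column: int, lower_is_better: bool) -> int:
    """Return the epoch whose value in `column` is best."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[column])[0]


lowest_val_loss_epoch = best_epoch(ROWS, column=1, lower_is_better=True)
highest_rouge1_epoch = best_epoch(ROWS, column=2, lower_is_better=False)
print(lowest_val_loss_epoch, highest_rouge1_epoch)
```

Validation loss bottoms out at epoch 8 (4.2794) and rises slightly afterwards, while Rouge-1, Rouge-2, Rouge-l, and Bertscore all reach their highest values at epoch 10.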


	### Framework versions

	- Transformers 4.20.0
	- PyTorch 1.11.0+cu113
	- Datasets 2.3.2
	- Tokenizers 0.12.1