taehyunzzz/switch-base-16-samsum-ba16-lr1e-04-top-4-choose-1

switch-base-16-samsum-ba16-lr1e-4-top-4-choose-1

This model is a fine-tuned version of google/switch-base-16 on the samsum dataset. It achieves the following results on the evaluation set:

Loss: 1.5674
Rouge1: 50.9015
Rouge2: 26.4267
Rougel: 42.4807
Rougelsum: 47.065
Gen Len: 24.2139
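The card does not include a usage example. The following is a minimal sketch of how a checkpoint like this one could be loaded for dialogue summarization with the Transformers Auto classes. The `summarize` helper, the `"summarize: "` prefix, and the generation length are assumptions for illustration only; the card does not say which input format was used during fine-tuning.

```python
# Repository id as shown on the model page.
MODEL_ID = "taehyunzzz/switch-base-16-samsum-ba16-lr1e-04-top-4-choose-1"


def summarize(dialogue: str, max_new_tokens: int = 64) -> str:
    """Hypothetical helper: generate a summary for one dialogue string."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: a T5-style task prefix; drop it if fine-tuning did not use one.
    inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("Amanda: I baked cookies. Do you want some?\nJerry: Sure!")` downloads the weights on first use, so it needs network access and enough memory for the checkpoint.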

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.1
num_epochs: 5.0
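
To make the `constant_with_warmup` setting concrete, here is a rough sketch of the learning rate implied by the values above. The steps-per-epoch figure is not stated in the card; it is inferred from the results table (step 400 is logged at epoch 0.4343). Rounding the warmup length up is likewise an assumption about how the warmup ratio is converted to steps.

```python
import math

# Values taken from the hyperparameter list.
LEARNING_RATE = 1e-4
WARMUP_RATIO = 0.1
NUM_EPOCHS = 5

# Assumption: inferred from the results table (step 400 at epoch 0.4343).
STEPS_PER_EPOCH = round(400 / 0.4343)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
# Assumption: the warmup length is the ratio times total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def learning_rate_at(step: int) -> float:
    """Linear ramp from 0 to the peak rate, then constant."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    return LEARNING_RATE


for step in (0, 200, WARMUP_STEPS, 4400):
    print(step, learning_rate_at(step))
```

Under these assumptions the rate reaches 1e-4 about halfway through the first epoch and stays there, so every checkpoint from step 800 onward was trained at the full learning rate.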

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
2.4108	0.4343	400	1.9721	42.5812	19.2812	35.4748	39.2541	19.2506
2.0404	0.8686	800	1.7618	46.3109	22.6628	39.0055	42.5976	19.055
1.9542	1.3029	1200	1.6832	47.8204	24.0498	40.4763	44.3711	20.6822
1.8495	1.7372	1600	1.6298	47.4526	24.2324	39.9469	43.7808	19.72
1.6194	2.1716	2000	1.6185	49.8848	25.1915	41.8312	46.4885	25.4976
1.652	2.6059	2400	1.6123	48.5854	24.7701	40.9614	44.9415	22.4682
1.4582	3.0402	2800	1.5927	49.7032	25.2983	41.5462	45.8574	22.9254
1.5009	3.4745	3200	1.5974	50.3499	26.122	42.1925	46.5237	24.0758
1.4458	3.9088	3600	1.5765	51.2501	26.6203	42.8644	47.4187	24.0061
1.3603	4.3431	4000	1.6234	51.5462	27.0733	43.1304	47.7257	25.4377
1.4493	4.7774	4400	1.5644	50.8852	26.6632	42.6825	47.0773	25.1797
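
The logged checkpoints can be compared programmatically. The sketch below copies the Step, Validation Loss, and Rouge1 columns from the table above and reports which step is best under each criterion; the `best_step` helper is a hypothetical name used only for illustration.

```python
# (step, validation_loss, rouge1) copied from the training results table.
CHECKPOINTS = [
    (400, 1.9721, 42.5812),
    (800, 1.7618, 46.3109),
    (1200, 1.6832, 47.8204),
    (1600, 1.6298, 47.4526),
    (2000, 1.6185, 49.8848),
    (2400, 1.6123, 48.5854),
    (2800, 1.5927, 49.7032),
    (3200, 1.5974, 50.3499),
    (3600, 1.5765, 51.2501),
    (4000, 1.6234, 51.5462),
    (4400, 1.5644, 50.8852),
]


def best_step(rows, column, higher_is_better):
    """Return the step whose value in `column` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


lowest_loss_step = best_step(CHECKPOINTS, column=1, higher_is_better=False)
highest_rouge1_step = best_step(CHECKPOINTS, column=2, higher_is_better=True)
print("lowest validation loss at step", lowest_loss_step)
print("highest Rouge1 at step", highest_rouge1_step)
```

The two criteria pick different checkpoints here, which is worth keeping in mind when selecting a checkpoint by loss rather than by ROUGE.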

Framework versions

Transformers 4.41.2
Pytorch 2.1.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1


Model size: 1.07B params (Safetensors, tensor type F32)
