taehyunzzz/switch-base-8-samsum-ba16-lr1e-04-top-1

Text2Text Generation

switch_transformers

Generated from Trainer


switch-base-8-samsum-ba16-lr0.0001-top-1

This model is a fine-tuned version of google/switch-base-8 on the samsum dataset. It achieves the following results on the evaluation set:

Loss: 1.5773
Rouge1: 50.2274
Rouge2: 26.1297
Rougel: 41.6305
Rougelsum: 46.212
Gen Len: 26.1064
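The card does not document how to run the model, so the following is only a minimal sketch of how a checkpoint like this is typically loaded with the Transformers auto classes. The helper name `summarize`, the input truncation length, and the generation length cap are assumptions made for illustration; they are not settings taken from this training run.

```python
# Minimal inference sketch (assumed usage, not documented by the model author).
MODEL_ID = "taehyunzzz/switch-base-8-samsum-ba16-lr1e-04-top-1"


def summarize(dialogue, max_new_tokens=64):
    """Return a generated summary for one dialogue string.

    `max_new_tokens=64` and `max_length=512` are illustrative assumptions.
    """
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize the dialogue and truncate overly long inputs.
    inputs = tokenizer(dialogue, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) generated sequence.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("Anna: Are we still on for lunch?\nBen: Yes, 12:30 at the usual place.")` would download the weights on first use. Whether the model expects a task prefix such as `summarize: ` is not stated in the card, so none is added here.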

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.1
num_epochs: 5.0
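The hyperparameters above can be expressed as `Seq2SeqTrainingArguments` keyword arguments. This is a sketch under assumptions: the card does not include the training script, so the mapping of `train_batch_size` to a per-device batch size, the use of `predict_with_generate`, and the helper name `build_training_args` are illustrative rather than confirmed.

```python
# Hyperparameters from the card, keyed by Seq2SeqTrainingArguments argument names.
HPARAMS = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant_with_warmup",
    "warmup_ratio": 0.1,
    "num_train_epochs": 5.0,
}


def build_training_args(output_dir):
    """Build training arguments from HPARAMS (hypothetical helper)."""
    # Imported lazily so HPARAMS can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: needed to compute ROUGE during eval
        **HPARAMS,
    )
```

Keeping the values in a plain dictionary makes it easy to compare them against the card line by line before constructing the arguments object.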

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
2.5688	0.4343	400	2.0174	39.045	18.0039	32.4777	36.2922	20.7787
2.054	0.8686	800	1.7355	45.9504	22.2553	37.9856	42.4096	24.9254
1.9326	1.3029	1200	1.6762	46.4474	22.8747	39.0668	42.8255	20.0905
1.8121	1.7372	1600	1.6212	47.4383	23.9879	39.9156	44.1151	21.1174
1.6303	2.1716	2000	1.6068	49.5797	25.5351	41.2855	45.9999	24.8362
1.6817	2.6059	2400	1.5734	49.0904	24.9926	41.2085	45.4779	23.0905
1.4335	3.0402	2800	1.5943	49.5091	25.7276	41.7665	45.7573	22.1553
1.5042	3.4745	3200	1.5807	49.1947	25.6961	41.2511	45.5553	22.6149
1.4447	3.9088	3600	1.5747	50.1246	26.1223	41.8475	46.3095	25.5709
1.3638	4.3431	4000	1.6004	50.655	26.3528	42.3721	46.8937	25.0685
1.4508	4.7774	4400	1.5741	49.9176	25.9264	41.7646	46.0881	24.879
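The step and epoch columns in the table make it possible to estimate the length of the warmup phase. The sketch below derives steps per epoch from the first row (step 400 at epoch 0.4343) and applies the warmup ratio of 0.1 with a constant-with-warmup rule: a linear ramp followed by a flat rate. The rounding choices and the function name `lr_at` are assumptions, so the figures are estimates rather than values logged by the run.

```python
import math

BASE_LR = 1e-4       # learning_rate from the card
WARMUP_RATIO = 0.1   # lr_scheduler_warmup_ratio from the card
NUM_EPOCHS = 5       # num_epochs from the card

# First table row: step 400 was reached at epoch 0.4343.
steps_per_epoch = round(400 / 0.4343)
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Estimated learning rate at an optimizer step.

    Linear ramp from 0 to BASE_LR over the warmup steps, then constant.
    """
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    return BASE_LR


print(steps_per_epoch, total_steps, warmup_steps)
print(lr_at(400), lr_at(800))
```

Under these assumptions the first evaluation at step 400 falls inside the warmup phase, while every later row in the table was trained at the full learning rate.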

Framework versions

Transformers 4.41.2
Pytorch 2.1.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1

Model size: 619M params (Safetensors, tensor type F32)

Evaluation results

Rouge1 on samsum (validation set, self-reported): 50.227