taehyunzzz/switch-base-8-samsum-ba16-lr1e-04-top-2-choose-1

Tags: Text2Text Generation, switch_transformers, Generated from Trainer

switch-base-8-samsum-ba16-lr1e-04-top-2-choose-1

This model is a fine-tuned version of google/switch-base-8 on the samsum dataset. At the final logged evaluation (step 4500), it achieves the following results on the evaluation set:

Loss: 1.5912
Rouge1: 50.0097
Rouge2: 25.9731
Rougel: 41.7903
Rougelsum: 46.3722
Gen Len: 22.2347
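
The card gives no usage instructions, so the following is only a minimal sketch of how a checkpoint like this is commonly loaded with the Transformers `AutoTokenizer` and `AutoModelForSeq2SeqLM` classes. The helper name `summarize` and the `max_new_tokens=64` default are illustrative assumptions, not settings taken from the card. The card also does not say whether a task prefix was prepended to inputs during fine-tuning, so none is added here.

```python
# Repository ID as shown on the model page.
MODEL_ID = "taehyunzzz/switch-base-8-samsum-ba16-lr1e-04-top-2-choose-1"


def summarize(dialogue: str, max_new_tokens: int = 64) -> str:
    """Generate a summary for one dialogue (hypothetical helper, not from the card)."""
    # Imported inside the function so that defining the helper stays cheap;
    # the first call downloads the tokenizer and the model weights.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: the raw dialogue is passed as-is, truncated to the
    # tokenizer's maximum input length.
    inputs = tokenizer(dialogue, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("Anna: Are we still on for lunch?\nBen: Yes, 12:30 at the usual place.")` would return a short summary string. The reported Gen Len of about 22 tokens suggests the 64-token cap should rarely bind, though that is an inference rather than something the card states.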

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

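The model was fine-tuned on the samsum dataset, and the reported metrics are self-reported on its validation set. More information needed.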

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.1
num_epochs: 5.0
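
To make the `constant_with_warmup` setting concrete, here is a rough sketch of the learning rate it would produce under the hyperparameters above. The steps-per-epoch figure is not stated in the card; it is inferred from the first row of the training results (step 500 at epoch 0.5429), and rounding the warmup length up is likewise an assumption. The function name `learning_rate` is illustrative.

```python
import math

BASE_LR = 1e-4        # learning_rate from the card
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the card
NUM_EPOCHS = 5        # num_epochs from the card

# Assumption: steps per epoch inferred from the results table
# (step 500 corresponds to epoch 0.5429).
STEPS_PER_EPOCH = round(500 / 0.5429)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
# Assumption: the warmup length is the ratio times the total, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def learning_rate(step: int) -> float:
    """Linear ramp from 0 to BASE_LR during warmup, then constant."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    return BASE_LR


for step in (0, 250, 500, 4500):
    print(step, learning_rate(step))
```

Under these assumptions the warmup ends before the first evaluation at step 500, so every logged checkpoint was trained at the full learning rate of 0.0001 from that point on.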

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
2.2794	0.5429	500	1.9027	43.093	20.2653	36.0953	39.9306	19.1553
1.9316	1.0858	1000	1.7250	46.9261	22.2483	38.479	43.0242	22.0636
1.8271	1.6287	1500	1.6592	47.8811	24.297	39.9511	44.3462	21.8301
1.6252	2.1716	2000	1.6083	48.5006	24.4846	40.6387	44.8987	21.7775
1.6832	2.7144	2500	1.5831	47.9987	24.2297	40.1482	44.3065	19.3594
1.4585	3.2573	3000	1.6166	49.9246	25.9232	41.8637	46.2275	22.4242
1.5714	3.8002	3500	1.5817	49.4227	25.4827	41.4224	46.0025	22.3423
1.3661	4.3431	4000	1.6149	50.1332	26.002	42.11	46.4761	22.1222
1.4259	4.8860	4500	1.5912	50.0097	25.9731	41.7903	46.3722	22.2347
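
The headline metrics correspond to the last logged row (step 4500), which is not the best row on either validation loss or Rouge1. A small sketch that checks this against the table; the values are copied from the rows above, and the variable names are illustrative.

```python
# (step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (500, 1.9027, 43.093),
    (1000, 1.7250, 46.9261),
    (1500, 1.6592, 47.8811),
    (2000, 1.6083, 48.5006),
    (2500, 1.5831, 47.9987),
    (3000, 1.6166, 49.9246),
    (3500, 1.5817, 49.4227),
    (4000, 1.6149, 50.1332),
    (4500, 1.5912, 50.0097),
]

best_by_loss = min(ROWS, key=lambda row: row[1])    # lowest validation loss
best_by_rouge1 = max(ROWS, key=lambda row: row[2])  # highest Rouge1
final_row = ROWS[-1]                                # row reported in the summary

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
print("final row:", final_row)
```

The lowest validation loss is at step 3500 and the highest Rouge1 at step 4000, both within a small margin of the final row.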

Framework versions

Transformers 4.41.2
PyTorch 2.1.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1

Safetensors

Model size: 619M params
Tensor type: F32

Finetuned from: google/switch-base-8

Dataset used to train taehyunzzz/switch-base-8-samsum-ba16-lr1e-04-top-2-choose-1: samsum

Evaluation results

Rouge1 on samsum (validation set, self-reported): 50.010
