cc1f64d7b82d49db2596a8afa29e9205

This model is a fine-tuned version of FacebookAI/xlm-roberta-large-finetuned-conll03-english on the nyu-mll/glue [mnli] dataset. It achieves the following results on the evaluation set:

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
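
The total batch sizes listed above follow from the per-device batch sizes and the device count. A minimal sketch of that arithmetic, assuming no gradient accumulation (the card does not list a gradient accumulation setting, so `grad_accum_steps=1` is an assumption, and `total_batch_size` is a hypothetical helper written for illustration):

```python
# Hyperparameters copied from the list above.
hparams = {
    "learning_rate": 5e-05,
    "train_batch_size": 8,  # per device
    "eval_batch_size": 8,   # per device
    "num_devices": 4,
    "seed": 42,
    "lr_scheduler_type": "constant",
    "num_epochs": 50,
}


def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Effective batch size across all devices.

    grad_accum_steps defaults to 1 because the card does not report
    gradient accumulation; this is an assumption, not a stated setting.
    """
    return per_device * num_devices * grad_accum_steps


total_train = total_batch_size(hparams["train_batch_size"], hparams["num_devices"])
total_eval = total_batch_size(hparams["eval_batch_size"], hparams["num_devices"])
print(total_train, total_eval)
```

With 8 examples per device on 4 devices, both values come out to 32, matching total_train_batch_size and total_eval_batch_size.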

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Accuracy	F1 Macro	RougeL1	RougeL	RougeLsum
No log	0	0	1.1105	0	13.4701	0.3190	0.2654	0.3189	0.3189	0.3193
1.1223	1	12271	1.1043	0.0078	28.4816	0.3182	0.1609	0.3184	0.3182	0.3183
1.115	2	24542	1.0970	0.0156	43.9428	0.3545	0.1745	0.3544	0.3545	0.3543
1.1053	3	36813	1.1007	0.0312	73.5944	0.3273	0.1644	0.3273	0.3275	0.3277
1.1133	4	49084	1.0982	0.0625	131.4488	0.3545	0.1745	0.3544	0.3545	0.3543
1.1115	5	61355	1.1036	0.125	246.8115	0.3545	0.1745	0.3544	0.3545	0.3543
1.1045	6	73626	1.1037	0.25	482.9354	0.3273	0.1644	0.3273	0.3275	0.3277
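
One way to read the table is to pick the best checkpoint and compare it with a chance baseline. The sketch below copies the Epoch, Validation Loss, Accuracy, and F1 Macro columns from the rows above. MNLI has three labels (entailment, neutral, contradiction), so uniform guessing gives roughly 1/3 accuracy; treating 1/3 as the baseline is an assumption that ignores any class imbalance in the evaluation split. The `EvalRow` type and variable names are hypothetical.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    val_loss: float
    accuracy: float
    f1_macro: float


# Values copied from the results table above.
rows = [
    EvalRow(0, 1.1105, 0.3190, 0.2654),
    EvalRow(1, 1.1043, 0.3182, 0.1609),
    EvalRow(2, 1.0970, 0.3545, 0.1745),
    EvalRow(3, 1.1007, 0.3273, 0.1644),
    EvalRow(4, 1.0982, 0.3545, 0.1745),
    EvalRow(5, 1.1036, 0.3545, 0.1745),
    EvalRow(6, 1.1037, 0.3273, 0.1644),
]

# Lowest validation loss and highest accuracy (ties resolve to the earliest epoch).
best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_acc = max(rows, key=lambda r: r.accuracy)

# Assumed baseline: uniform guessing over the three MNLI labels.
chance_accuracy = 1 / 3
margin_over_chance = best_by_acc.accuracy - chance_accuracy

print(f"best loss: epoch {best_by_loss.epoch} ({best_by_loss.val_loss:.4f})")
print(f"best accuracy: epoch {best_by_acc.epoch} ({best_by_acc.accuracy:.4f})")
print(f"margin over chance: {margin_over_chance:.4f}")
```

On these rows, the best accuracy sits only about two percentage points above the assumed chance level, which is worth keeping in mind before using this checkpoint for inference.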

Model size: 0.6B params
Tensor type: F32
Weights format: Safetensors