Disfluency Labeling - Ariel Cerda

This model (arielcerdap/disfluency_roberta_large) is a fine-tuned version of FacebookAI/roberta-large (354M parameters) on the TimeStamped dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6249
  • Precision: 0.0
  • Recall: 0.0
  • F1: 0.0
  • Accuracy: 0.9075

Note that zero precision, recall, and F1 alongside 0.9075 accuracy means the final checkpoint predicts no disfluency spans at all: it assigns every token the majority (non-disfluent) label. The training log below shows this collapse setting in late in training, from about epoch 5.9 onward; earlier checkpoints (e.g. epoch 5.7, F1 0.5057) are substantially stronger.
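
Assuming the checkpoint carries a token-classification head (consistent with the span-level precision/recall/F1 reported here), it can be loaded with the standard Transformers pipeline. This is a minimal usage sketch; the example sentence is illustrative, and the label names come from the checkpoint's own config rather than from this card:

```python
# Minimal usage sketch, assuming a token-classification head.
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_id = "arielcerdap/disfluency_roberta_large"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# "simple" aggregation merges adjacent subword tokens that share a label
# into one span.
tagger = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

print(tagger("I I mean we should uh probably start over"))
```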

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a code sketch reproducing them follows the list:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 8
  • mixed_precision_training: Native AMP
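
A minimal sketch of these settings as a Transformers `TrainingArguments` object, assuming the standard Trainer recipe. The output directory is an assumed name; the 500-step evaluation cadence is taken from the results table below:

```python
# Sketch reproducing the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="disfluency_roberta_large",  # assumed name
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=8,
    fp16=True,              # Native AMP mixed precision
    eval_strategy="steps",
    eval_steps=500,         # matches the evaluation cadence in the log below
)
```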

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.2995        | 0.2194 | 500   | 0.3309          | 0.9467    | 0.3143 | 0.4720 | 0.9365   |
| 0.354         | 0.4388 | 1000  | 0.2799          | 0.9079    | 0.3216 | 0.4750 | 0.9366   |
| 0.3717        | 0.6582 | 1500  | 0.2715          | 0.9458    | 0.3351 | 0.4948 | 0.9381   |
| 0.3362        | 0.8776 | 2000  | 0.3026          | 0.9153    | 0.3195 | 0.4737 | 0.9363   |
| 0.2668        | 1.0970 | 2500  | 0.3130          | 0.9519    | 0.3320 | 0.4923 | 0.9376   |
| 0.3311        | 1.3164 | 3000  | 0.2815          | 0.9687    | 0.3299 | 0.4922 | 0.9379   |
| 0.3345        | 1.5358 | 3500  | 0.3048          | 0.9976    | 0.2506 | 0.4006 | 0.9307   |
| 0.2945        | 1.7552 | 4000  | 0.2890          | 0.9621    | 0.3326 | 0.4943 | 0.9378   |
| 0.2648        | 1.9746 | 4500  | 0.2850          | 0.9740    | 0.3311 | 0.4942 | 0.9380   |
| 0.3272        | 2.1939 | 5000  | 0.2827          | 0.9657    | 0.3430 | 0.5062 | 0.9388   |
| 0.3161        | 2.4133 | 5500  | 0.2759          | 0.9237    | 0.3357 | 0.4924 | 0.9367   |
| 0.2687        | 2.6327 | 6000  | 0.2891          | 0.9757    | 0.3308 | 0.4941 | 0.9381   |
| 0.2948        | 2.8521 | 6500  | 0.2872          | 0.9784    | 0.3177 | 0.4796 | 0.9368   |
| 0.2608        | 3.0715 | 7000  | 0.2901          | 0.8284    | 0.3445 | 0.4866 | 0.9338   |
| 0.2947        | 3.2909 | 7500  | 0.2829          | 0.9572    | 0.3341 | 0.4954 | 0.9379   |
| 0.2939        | 3.5103 | 8000  | 0.2814          | 0.9702    | 0.3277 | 0.4900 | 0.9377   |
| 0.2581        | 3.7297 | 8500  | 0.2764          | 0.9757    | 0.3311 | 0.4944 | 0.9381   |
| 0.3108        | 3.9491 | 9000  | 0.2809          | 0.9721    | 0.3293 | 0.4919 | 0.9379   |
| 0.2929        | 4.1685 | 9500  | 0.2874          | 0.9737    | 0.3274 | 0.4901 | 0.9377   |
| 0.2939        | 4.3879 | 10000 | 0.2760          | 0.9689    | 0.3323 | 0.4949 | 0.9381   |
| 0.3173        | 4.6073 | 10500 | 0.2784          | 0.9722    | 0.3311 | 0.4940 | 0.9381   |
| 0.2784        | 4.8267 | 11000 | 0.2825          | 0.9709    | 0.3360 | 0.4992 | 0.9384   |
| 0.2593        | 5.0461 | 11500 | 0.2775          | 0.9724    | 0.3335 | 0.4967 | 0.9383   |
| 0.2507        | 5.2655 | 12000 | 0.2985          | 0.9708    | 0.3348 | 0.4978 | 0.9383   |
| 0.2707        | 5.4849 | 12500 | 0.2805          | 0.9714    | 0.3421 | 0.5060 | 0.9389   |
| 0.2775        | 5.7043 | 13000 | 0.2757          | 0.9697    | 0.3421 | 0.5057 | 0.9390   |
| 0.5178        | 5.9237 | 13500 | 0.4682          | 0.9052    | 0.0845 | 0.1545 | 0.9151   |
| 0.3553        | 6.1430 | 14000 | 0.3657          | 0.9574    | 0.1988 | 0.3292 | 0.9257   |
| 0.3496        | 6.3624 | 14500 | 0.3986          | 0.9565    | 0.1945 | 0.3233 | 0.9253   |
| 0.3452        | 6.5818 | 15000 | 0.4337          | 0.0       | 0.0    | 0.0    | 0.9075   |
| 0.3931        | 6.8012 | 15500 | 0.5834          | 0.0       | 0.0    | 0.0    | 0.9075   |
| 0.4035        | 7.0206 | 16000 | 0.5584          | 0.0       | 0.0    | 0.0    | 0.9075   |
| 0.3831        | 7.2400 | 16500 | 0.5585          | 0.0       | 0.0    | 0.0    | 0.9075   |
| 0.2817        | 7.4594 | 17000 | 0.5946          | 0.0       | 0.0    | 0.0    | 0.9075   |
| 0.3641        | 7.6788 | 17500 | 0.6069          | 0.0       | 0.0    | 0.0    | 0.9075   |
| 0.3866        | 7.8982 | 18000 | 0.6249          | 0.0       | 0.0    | 0.0    | 0.9075   |
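
The precision/recall/F1 columns behave like seqeval's span-level metrics, which is presumably how they were computed (the standard Transformers token-classification recipe uses the `evaluate` library's seqeval wrapper; this is an assumption, not documented in the card). Span-level scoring also explains the final rows: a model that emits only the "O" tag predicts zero spans, so precision/recall/F1 drop to 0.0 while token-level accuracy stays at the majority-class rate of 0.9075. A minimal sketch with placeholder tags:

```python
# Hedged sketch of span-level metric computation with seqeval via the
# `evaluate` library; the DISFLUENCY tag names are illustrative
# placeholders, not this checkpoint's documented label set.
import evaluate

seqeval = evaluate.load("seqeval")

predictions = [["O", "B-DISFLUENCY", "I-DISFLUENCY", "O"]]
references  = [["O", "B-DISFLUENCY", "I-DISFLUENCY", "O"]]

results = seqeval.compute(predictions=predictions, references=references)
print(results["overall_precision"], results["overall_recall"],
      results["overall_f1"], results["overall_accuracy"])

# A degenerate model that predicts only "O" yields 0.0 precision/recall/F1
# but high token accuracy, matching the final rows of the table above.
all_o = [["O", "O", "O", "O"]]
print(seqeval.compute(predictions=all_o, references=references))
```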

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0