distilroberta-current

This model classifies articles as current (covering or discussing current events) or not current (not relating to current events).

The model is a fine-tuned version of distilroberta-base, trained on a dataset of articles labeled through a combination of weak supervision and manual labeling.
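
A minimal inference sketch, assuming a recent Transformers release with a PyTorch backend. The `model_id` argument is a placeholder: this card gives only the model name, so pass the full Hub repository id or a local checkpoint path. The function name `classify_article` is illustrative, and the label strings the model returns depend on the checkpoint's configuration, which is not documented here.

```python
def classify_article(text, model_id):
    """Return the top predicted label and its score for one article.

    model_id is an assumption: supply the Hub repository id or a local
    path for this checkpoint, since the card does not state it.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True clips inputs longer than the model's maximum length.
    result = classifier(text, truncation=True)[0]
    return result["label"], result["score"]
```

When scoring many articles, it is likely cheaper to build the pipeline once and reuse it rather than calling this helper per article.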

It achieves the following results on the evaluation set:

  • Loss: 0.1745
  • Accuracy: 0.9355

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 12345
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 16
  • num_epochs: 20
  • mixed_precision_training: Native AMP
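
The batch-size and scheduler settings above can be sketched in plain Python. This is an illustration, not the training code: it assumes a single device (consistent with 8 × 4 = 32) and 220 total optimizer steps, taken from the 11 steps per epoch shown in the training results multiplied by the configured 20 epochs. The function names are hypothetical.

```python
def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def linear_warmup_lr(step, base_lr=2e-5, warmup_steps=16, total_steps=220):
    """Learning rate under a linear schedule with linear warmup.

    total_steps=220 is an assumption: 11 steps per epoch * 20 epochs.
    """
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(8, 4))  # 32
for step in (0, 8, 16, 99, 220):
    print(step, linear_warmup_lr(step))
```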

Training results

Training Loss   Epoch   Step   Validation Loss   Accuracy
No log          1.0     11     0.6559            0.7097
0.6762          2.0     22     0.5627            0.7097
0.5432          3.0     33     0.4606            0.7097
0.5432          4.0     44     0.3651            0.8065
0.411           5.0     55     0.2512            0.9194
0.269           6.0     66     0.2774            0.9355
0.269           7.0     77     0.2062            0.8710
0.2294          8.0     88     0.2598            0.9355
0.1761          9.0     99     0.1745            0.9355
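
The headline evaluation figures correspond to the last logged row, which is also the row with the lowest validation loss. A small sketch that transcribes the table and checks this; the variable names are illustrative.

```python
# (epoch, step, validation_loss, accuracy), transcribed from the table above.
RESULTS = [
    (1, 11, 0.6559, 0.7097),
    (2, 22, 0.5627, 0.7097),
    (3, 33, 0.4606, 0.7097),
    (4, 44, 0.3651, 0.8065),
    (5, 55, 0.2512, 0.9194),
    (6, 66, 0.2774, 0.9355),
    (7, 77, 0.2062, 0.8710),
    (8, 88, 0.2598, 0.9355),
    (9, 99, 0.1745, 0.9355),
]

# Row with the lowest validation loss.
best = min(RESULTS, key=lambda row: row[2])
print(best)  # (9, 99, 0.1745, 0.9355)

# Optimizer steps per epoch implied by the Step column.
steps_per_epoch = {step // epoch for epoch, step, _, _ in RESULTS}
print(steps_per_epoch)  # {11}
```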

Framework versions

  • Transformers 4.11.3
  • PyTorch 1.10.1
  • Datasets 1.17.0
  • Tokenizers 0.10.3