Legal_QA_BERT_augmented_17

This model was trained from scratch on an unspecified dataset (the dataset name was recorded as None). It achieves the following result on the evaluation set at the final epoch (17):

  • Loss: 3.9583

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 17
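
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup setting is listed here) and takes the total of 255 steps from the training results table; the names `linear_lr`, `BASE_LR`, `TOTAL_STEPS`, and `WARMUP_STEPS` are hypothetical and not part of any library.

```python
# Values taken from the hyperparameter list and results table;
# the zero-warmup setting is an assumption, not something the card states.
BASE_LR = 2e-05
TOTAL_STEPS = 255   # 17 epochs x 15 steps per epoch
WARMUP_STEPS = 0    # assumption: no warmup setting is listed


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Linear decay from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for epoch in (0, 6, 17):
        step = epoch * 15
        print(f"epoch {epoch:>2} (step {step:>3}): lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at the configured 2e-05 and reaches zero at the last step, so later epochs update the weights with progressively smaller steps.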

Training results

Training Loss   Epoch   Step   Validation Loss
No log          1.0     15     4.3302
No log          2.0     30     3.7975
No log          3.0     45     3.4189
No log          4.0     60     3.2313
No log          5.0     75     3.2055
No log          6.0     90     3.1888
No log          7.0     105    3.2470
No log          8.0     120    3.4233
No log          9.0     135    3.4465
No log          10.0    150    3.6328
No log          11.0    165    3.7262
No log          12.0    180    3.7712
No log          13.0    195    3.8782
No log          14.0    210    3.9330
No log          15.0    225    4.0671
No log          16.0    240    3.8164
No log          17.0    255    3.9583
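
The validation loss in the table above bottoms out at epoch 6 (3.1888) and then rises, so the final checkpoint is not the best one by this metric. The sketch below makes that comparison explicit and also derives a rough range for the training set size from the step count. The size estimate assumes a single device, no gradient accumulation, and a kept final partial batch, none of which is stated in the card; all variable names are hypothetical.

```python
# (epoch, step, validation_loss) rows copied from the training results table.
RESULTS = [
    (1, 15, 4.3302), (2, 30, 3.7975), (3, 45, 3.4189), (4, 60, 3.2313),
    (5, 75, 3.2055), (6, 90, 3.1888), (7, 105, 3.2470), (8, 120, 3.4233),
    (9, 135, 3.4465), (10, 150, 3.6328), (11, 165, 3.7262), (12, 180, 3.7712),
    (13, 195, 3.8782), (14, 210, 3.9330), (15, 225, 4.0671), (16, 240, 3.8164),
    (17, 255, 3.9583),
]

# Epoch with the lowest validation loss versus the final epoch.
best_epoch, best_step, best_loss = min(RESULTS, key=lambda row: row[2])
final_epoch, final_step, final_loss = RESULTS[-1]
gap = final_loss - best_loss

# Rough training-set size implied by 15 steps per epoch at batch size 32.
# Assumptions: single device, no gradient accumulation, last partial batch kept,
# so steps_per_epoch == ceil(num_examples / batch_size).
STEPS_PER_EPOCH = RESULTS[0][1]
TRAIN_BATCH_SIZE = 32
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print(f"best:  epoch {best_epoch}, step {best_step}, loss {best_loss:.4f}")
    print(f"final: epoch {final_epoch}, step {final_step}, loss {final_loss:.4f}")
    print(f"final minus best: {gap:.4f}")
    print(f"implied training examples: {min_examples} to {max_examples}")
```

If checkpoints were saved per epoch, selecting the one with the lowest validation loss rather than the last one would be a natural choice; whether such checkpoints exist for this model is not stated.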

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0

Safetensors

  • Model size: 108M params
  • Tensor type: F32