BERT-legal-de-cased_German_legal_SQuAD_complete_augmented_17

This model was trained from scratch; the training dataset is not recorded (the card lists it as None). It achieves the following result on the evaluation set after the final epoch:

  • Loss: 3.7018

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 160
  • eval_batch_size: 40
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 17
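
The linear scheduler listed above can be illustrated with a short sketch. It assumes no warmup (the card lists none) and takes the total of 51 optimizer steps from the training results table; the function name `linear_lr` and its parameters are illustrative and not part of the original training script.

```python
def linear_lr(step, base_lr=2e-05, total_steps=51, warmup_steps=0):
    """Learning rate after `step` optimizer updates under linear warmup and decay.

    Assumption: warmup_steps=0, since the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Learning rate at the end of each epoch (3 optimizer steps per epoch).
for epoch in range(1, 18):
    print(epoch, f"{linear_lr(3 * epoch):.3e}")
```

Under these assumptions the learning rate starts at 2e-05 and falls to zero at step 51, so the small loss improvements in the last epochs coincide with a very small step size.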

Training results

Training Loss   Epoch   Step   Validation Loss
No log          1.0     3      6.2400
No log          2.0     6      6.2810
No log          3.0     9      5.7072
No log          4.0     12     5.4438
No log          5.0     15     5.2361
No log          6.0     18     4.9645
No log          7.0     21     4.7359
No log          8.0     24     4.5076
No log          9.0     27     4.3330
No log          10.0    30     4.1625
No log          11.0    33     4.0412
No log          12.0    36     3.9448
No log          13.0    39     3.8519
No log          14.0    42     3.7690
No log          15.0    45     3.7332
No log          16.0    48     3.7108
No log          17.0    51     3.7018
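
The table supports a few rough inferences, sketched below. The training-set size bounds assume a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these is stated in the card. The "No log" entries are most likely because the Trainer's default logging interval (500 steps) exceeds the 51 total steps, so no training loss was ever recorded.

```python
TRAIN_BATCH_SIZE = 160

# (epoch, step, validation loss) copied from the results table.
RESULTS = [
    (1, 3, 6.2400), (2, 6, 6.2810), (3, 9, 5.7072), (4, 12, 5.4438),
    (5, 15, 5.2361), (6, 18, 4.9645), (7, 21, 4.7359), (8, 24, 4.5076),
    (9, 27, 4.3330), (10, 30, 4.1625), (11, 33, 4.0412), (12, 36, 3.9448),
    (13, 39, 3.8519), (14, 42, 3.7690), (15, 45, 3.7332), (16, 48, 3.7108),
    (17, 51, 3.7018),
]

# Optimizer steps per epoch, from the final row (51 steps over 17 epochs).
steps_per_epoch = RESULTS[-1][1] // RESULTS[-1][0]

# With ceil(n / batch_size) == steps_per_epoch, n lies in this range
# (assumption: single device, no gradient accumulation, partial batch kept).
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Epochs where validation loss rose relative to the previous epoch.
increases = [
    cur[0] for prev, cur in zip(RESULTS, RESULTS[1:]) if cur[2] > prev[2]
]

# Relative reduction in validation loss from epoch 1 to epoch 17.
relative_drop = (RESULTS[0][2] - RESULTS[-1][2]) / RESULTS[0][2]

print(f"steps per epoch: {steps_per_epoch}")
print(f"training examples (estimated): {min_examples} to {max_examples}")
print(f"epochs with a loss increase: {increases}")
print(f"relative loss reduction: {relative_drop:.1%}")
```

The loss is still decreasing at epoch 17, though by less than 0.01 per epoch, which is consistent with the learning rate having decayed to nearly zero under the linear schedule.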

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0

Safetensors

  • Model size: 108M params
  • Tensor type: F32
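
As a rough guide to download size and memory needs, the weight footprint can be estimated from the figures above. This is a back-of-the-envelope sketch: the parameter count is the rounded value shown on the card, and the estimate covers weights only, not activations, optimizer state, or file metadata.

```python
PARAMS = 108_000_000   # rounded parameter count from the card
BYTES_PER_PARAM = 4    # F32 stores each parameter in 4 bytes

weight_bytes = PARAMS * BYTES_PER_PARAM
weight_mb = weight_bytes / 1e6       # decimal megabytes
weight_mib = weight_bytes / 2**20    # binary mebibytes

print(f"approx. weight size: {weight_mb:.0f} MB ({weight_mib:.0f} MiB)")
```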