GQA_BERT_German_legal_SQuAD_17

This model was trained from scratch. The training dataset is not specified (the card records it as None). It achieves the following result on the evaluation set:

  • Loss: 1.7586

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 160
  • eval_batch_size: 40
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 17
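
As a rough illustration of how the learning rate and scheduler settings above interact, the sketch below computes the learning rate that a linear decay schedule would yield at a given optimizer step. It assumes no warmup (the card lists none) and 34 total optimizer steps, the final step count reported in the training results. The name `linear_lr` is hypothetical and is not a Transformers function; it only mirrors the usual linear-decay rule.

```python
BASE_LR = 2e-05      # learning_rate from the card
TOTAL_STEPS = 34     # final step count reported in the training results
WARMUP_STEPS = 0     # assumption: the card does not list any warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to 0."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 17, 34):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is halved by the midpoint of training, and reaches zero at the last step.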

Training results

Training Loss   Epoch   Step   Validation Loss
No log          1.0     2      5.4404
No log          2.0     4      4.4407
No log          3.0     6      3.9783
No log          4.0     8      3.6009
No log          5.0     10     3.2873
No log          6.0     12     3.0050
No log          7.0     14     2.7571
No log          8.0     16     2.5398
No log          9.0     18     2.3554
No log          10.0    20     2.2110
No log          11.0    22     2.0977
No log          12.0    24     2.0078
No log          13.0    26     1.9261
No log          14.0    28     1.8590
No log          15.0    30     1.8072
No log          16.0    32     1.7733
No log          17.0    34     1.7586
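
The step counts above allow a back-of-the-envelope check on the training set size and on how quickly the validation loss is flattening. The sketch below is only an estimate: it assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept, none of which the card states. The helper names are hypothetical.

```python
TRAIN_BATCH_SIZE = 160   # train_batch_size from the card
STEPS_PER_EPOCH = 2      # the step counter advances by 2 per epoch

# Validation loss after each of the 17 epochs, copied from the results.
VAL_LOSS = [
    5.4404, 4.4407, 3.9783, 3.6009, 3.2873, 3.0050, 2.7571, 2.5398, 2.3554,
    2.2110, 2.0977, 2.0078, 1.9261, 1.8590, 1.8072, 1.7733, 1.7586,
]


def example_count_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that need exactly this many batches."""
    lower = (steps_per_epoch - 1) * batch_size + 1
    upper = steps_per_epoch * batch_size
    return lower, upper


def per_epoch_drops(losses):
    """Decrease in loss from each epoch to the next."""
    return [prev - cur for prev, cur in zip(losses, losses[1:])]


low, high = example_count_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
drops = per_epoch_drops(VAL_LOSS)

if __name__ == "__main__":
    print(f"training examples: between {low} and {high}")
    print(f"first drop: {drops[0]:.4f}, last drop: {drops[-1]:.4f}")
```

Under those assumptions the training set holds between 161 and 320 examples. The validation loss falls in every epoch, but the drop shrinks from roughly 1.0 after the first epoch to roughly 0.015 at the end, which suggests diminishing returns rather than a clear plateau.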

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0

Safetensors

  • Model size: 108M params
  • Tensor type: F32