GQA_BERT_German_legal_SQuAD_part_augmented_17

This model was trained from scratch on an unspecified dataset (the card lists the dataset as None). It achieves the following results on the evaluation set:

  • Loss: 1.8704

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 160
  • eval_batch_size: 40
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 17
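
The hyperparameters above, combined with the final step count of 51 reported in the training results, allow a rough reconstruction of the learning-rate schedule and of the training set size. The following is a minimal sketch, not the original training code. It assumes zero warmup steps, a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these settings are stated in the card.

```python
# Values taken from the card.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 160
NUM_EPOCHS = 17
TOTAL_STEPS = 51  # final step listed in the training results

# Assumption: zero warmup steps (the card lists none).
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates, decaying linearly to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Assumption: one device, no gradient accumulation, last partial batch kept,
# so steps_per_epoch equals ceil(n_examples / TRAIN_BATCH_SIZE).
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"steps per epoch: {steps_per_epoch}")
print(f"implied training set size: {min_examples} to {max_examples} examples")
for epoch in (0, 1, 9, 17):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch:>2} (step {step:>2}): lr = {linear_lr(step):.3e}")
```

Under these assumptions, three optimizer steps per epoch at a batch size of 160 imply a training set of roughly 321 to 480 examples, which is small for a model trained from scratch.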

Training results

Training Loss   Epoch   Step   Validation Loss
No log          1.0     3      5.1176
No log          2.0     6      4.6377
No log          3.0     9      4.0318
No log          4.0     12     3.7360
No log          5.0     15     3.3913
No log          6.0     18     3.0846
No log          7.0     21     2.8234
No log          8.0     24     2.6038
No log          9.0     27     2.4490
No log          10.0    30     2.2988
No log          11.0    33     2.1783
No log          12.0    36     2.0988
No log          13.0    39     2.0404
No log          14.0    42     1.9775
No log          15.0    45     1.9231
No log          16.0    48     1.8851
No log          17.0    51     1.8704
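
The validation losses above can be checked for convergence with a few lines of Python. This is a small sketch that uses only the numbers reported in the results; the variable names are illustrative. The "No log" entries most likely mean the 51-step run never reached the trainer's logging interval, though the card does not confirm this.

```python
# Validation loss per epoch, copied from the training results.
val_loss = [
    5.1176, 4.6377, 4.0318, 3.7360, 3.3913, 3.0846,
    2.8234, 2.6038, 2.4490, 2.2988, 2.1783, 2.0988,
    2.0404, 1.9775, 1.9231, 1.8851, 1.8704,
]

# Improvement from each epoch to the next (positive means the loss fell).
deltas = [prev - cur for prev, cur in zip(val_loss, val_loss[1:])]

strictly_decreasing = all(d > 0 for d in deltas)
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

print(f"loss fell every epoch: {strictly_decreasing}")
print(f"best epoch: {best_epoch} (loss {val_loss[best_epoch - 1]:.4f})")
print(f"first improvement: {deltas[0]:.4f}, last improvement: {deltas[-1]:.4f}")
```

The loss falls at every epoch and the best value is the final one, so the run had not yet plateaued. The per-epoch gain shrinks from about 0.48 at the start to about 0.015 at the end, which suggests further epochs would bring only small improvements.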

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0

Model size

  • Parameters: 108M
  • Tensor type: F32
  • Format: Safetensors
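
The parameter count and tensor type give a rough idea of the memory footprint. The sketch below is back-of-the-envelope arithmetic only: 108M is a rounded figure, and the training estimate assumes that gradients and both Adam moment buffers are also stored in F32, which the card does not state. Activations are excluded.

```python
PARAMS = 108_000_000  # "108M params" from the card; rounded, so results are approximate
BYTES_PER_F32 = 4

# Size of the weights alone, in megabytes (10^6 bytes).
weights_mb = PARAMS * BYTES_PER_F32 / 1e6

# Assumption: training with Adam holds weights, gradients, and two moment
# buffers, all in F32, i.e. four copies of the parameter tensor.
training_state_mb = 4 * weights_mb

print(f"weights: ~{weights_mb:.0f} MB")
print(f"weights + grads + Adam state: ~{training_state_mb:.0f} MB")
```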