BERT-legal-de-cased_German_legal_SQuAD_complete_augmented_100

This model was trained from scratch on an unspecified dataset (the dataset field of the card was left empty). It achieves the following result on the evaluation set after the final epoch:

  • Loss: 1.1526

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 160
  • eval_batch_size: 40
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
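
As a rough illustration of what `lr_scheduler_type: linear` implies for this run, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup value is listed above) and takes the total of 300 steps from the results table (3 steps per epoch over 100 epochs). The function name `linear_lr` is hypothetical and is not part of any library.

```python
# Minimal sketch of a linear learning-rate decay, under stated assumptions.
BASE_LR = 2e-05          # learning_rate from the hyperparameter list
NUM_EPOCHS = 100         # num_epochs from the hyperparameter list
STEPS_PER_EPOCH = 3      # read off the results table (step 3 at epoch 1.0)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 300
WARMUP_STEPS = 0         # assumption: no warmup value is listed in the card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Return the learning rate at `step` for linear warmup then linear decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 150, 300):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is halved by step 150 (epoch 50), and reaches zero at step 300, which may partly explain why the validation loss changes little in the last epochs.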

Training results

Training Loss  Epoch  Step  Validation Loss
No log         1.0    3     6.1716
No log         2.0    6     6.1155
No log         3.0    9     5.9077
No log         4.0    12    5.3879
No log         5.0    15    5.0605
No log         6.0    18    4.7106
No log         7.0    21    4.3762
No log         8.0    24    4.1123
No log         9.0    27    3.9312
No log         10.0   30    3.7030
No log         11.0   33    3.6067
No log         12.0   36    3.3692
No log         13.0   39    3.2907
No log         14.0   42    3.1433
No log         15.0   45    3.0788
No log         16.0   48    2.8640
No log         17.0   51    2.7940
No log         18.0   54    2.6554
No log         19.0   57    2.5276
No log         20.0   60    2.3190
No log         21.0   63    2.3093
No log         22.0   66    2.1057
No log         23.0   69    2.0637
No log         24.0   72    1.8860
No log         25.0   75    1.9736
No log         26.0   78    1.7669
No log         27.0   81    1.8228
No log         28.0   84    1.6687
No log         29.0   87    1.6073
No log         30.0   90    1.5079
No log         31.0   93    1.5595
No log         32.0   96    1.4414
No log         33.0   99    1.5535
No log         34.0   102   1.2956
No log         35.0   105   1.3910
No log         36.0   108   1.2563
No log         37.0   111   1.3677
No log         38.0   114   1.2738
No log         39.0   117   1.2571
No log         40.0   120   1.1964
No log         41.0   123   1.2716
No log         42.0   126   1.2709
No log         43.0   129   1.2578
No log         44.0   132   1.2077
No log         45.0   135   1.1723
No log         46.0   138   1.1896
No log         47.0   141   1.2000
No log         48.0   144   1.2256
No log         49.0   147   1.1436
No log         50.0   150   1.1785
No log         51.0   153   1.1908
No log         52.0   156   1.1874
No log         53.0   159   1.1698
No log         54.0   162   1.1164
No log         55.0   165   1.2061
No log         56.0   168   1.2007
No log         57.0   171   1.1804
No log         58.0   174   1.1095
No log         59.0   177   1.1358
No log         60.0   180   1.1718
No log         61.0   183   1.1490
No log         62.0   186   1.1712
No log         63.0   189   1.1858
No log         64.0   192   1.1166
No log         65.0   195   1.1321
No log         66.0   198   1.1600
No log         67.0   201   1.1244
No log         68.0   204   1.1524
No log         69.0   207   1.1676
No log         70.0   210   1.1455
No log         71.0   213   1.1868
No log         72.0   216   1.1721
No log         73.0   219   1.1277
No log         74.0   222   1.1309
No log         75.0   225   1.1908
No log         76.0   228   1.1964
No log         77.0   231   1.1512
No log         78.0   234   1.1572
No log         79.0   237   1.2009
No log         80.0   240   1.1888
No log         81.0   243   1.1377
No log         82.0   246   1.1146
No log         83.0   249   1.1026
No log         84.0   252   1.1421
No log         85.0   255   1.1447
No log         86.0   258   1.1208
No log         87.0   261   1.1050
No log         88.0   264   1.1345
No log         89.0   267   1.1562
No log         90.0   270   1.1491
No log         91.0   273   1.1267
No log         92.0   276   1.1183
No log         93.0   279   1.1371
No log         94.0   282   1.1566
No log         95.0   285   1.1662
No log         96.0   288   1.1628
No log         97.0   291   1.1547
No log         98.0   294   1.1499
No log         99.0   297   1.1506
No log         100.0  300   1.1526
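
The reported loss of 1.1526 is the value after the final epoch, not the lowest one in the table. The sketch below shows one way to parse rows of this log and pick the epoch with the lowest validation loss. It uses a four-row excerpt copied from the table; the helper names `parse_row` and `best_row` are hypothetical. Because the training-loss placeholder "No log" contains a space, each row is split from the right.

```python
# Minimal sketch: find the row with the lowest validation loss.
# ROWS is a four-row excerpt copied from the results table above.
ROWS = """\
No log 58.0 174 1.1095
No log 83.0 249 1.1026
No log 87.0 261 1.1050
No log 100.0 300 1.1526
"""


def parse_row(line):
    """Split one log row into training loss, epoch, step, validation loss."""
    # Split from the right so the two-word placeholder "No log" stays intact.
    train_loss, epoch, step, val_loss = line.rsplit(None, 3)
    return {
        "train_loss": None if train_loss == "No log" else float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
    }


def best_row(text):
    """Return the parsed row with the smallest validation loss."""
    rows = [parse_row(line) for line in text.splitlines() if line.strip()]
    return min(rows, key=lambda row: row["val_loss"])


if __name__ == "__main__":
    best = best_row(ROWS)
    print(best["epoch"], best["step"], best["val_loss"])
```

Over the full table, the minimum is 1.1026 at epoch 83 (step 249). Whether the published weights correspond to that checkpoint or to the final one is not stated in the card.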

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0
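
To reproduce results with matching library versions, a quick environment check along the following lines may help. It is a sketch using only the standard library. The helper name `check_versions` is hypothetical, and the distribution name `torch` is used for the entry listed above as Pytorch. Local build tags such as `+cu121` are ignored in the comparison, which is a simplifying assumption.

```python
# Minimal sketch: compare installed package versions with the ones listed above.
from importlib.metadata import PackageNotFoundError, version

PINNED = {
    "transformers": "4.36.2",
    "torch": "2.1.2+cu121",   # listed as "Pytorch" in the card
    "datasets": "2.14.7",
    "tokenizers": "0.15.0",
}


def check_versions(pinned):
    """Map each package to (wanted, installed or None, versions match)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Assumption: compare only the public version, dropping "+cu121" etc.
        matches = (
            installed is not None
            and installed.split("+")[0] == wanted.split("+")[0]
        )
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_versions(PINNED).items():
        print(f"{name}: wanted {wanted}, installed {installed}, match={ok}")
```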

Safetensors

  • Model size: 108M params
  • Tensor type: F32