
RoBERTa-legal-de-cased_German_legal_SQuAD_100

As the repository name suggests, this checkpoint appears to be a cased German legal RoBERTa model fine-tuned for 100 epochs on a German legal SQuAD-style question-answering dataset; the base-model and dataset fields were left unfilled when this card was auto-generated. It achieves the following results on the evaluation set:

  • Loss: 1.3939

Model description

More information needed. The checkpoint itself comprises roughly 124M parameters, stored as F32 safetensors.

Intended uses & limitations

More information needed
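
Pending fuller documentation, a minimal inference sketch in Python, assuming the checkpoint carries an extractive question-answering head (as the SQuAD-style name suggests) and is published on the Hugging Face Hub; the repository ID below is a placeholder for the actual path:

```python
from transformers import pipeline

# Placeholder repository ID; substitute the actual Hub path of this checkpoint.
qa = pipeline(
    "question-answering",
    model="RoBERTa-legal-de-cased_German_legal_SQuAD_100",
)

result = qa(
    question="Wann tritt das Gesetz in Kraft?",  # "When does the law take effect?"
    context="Das Gesetz tritt am 1. Januar 2024 in Kraft.",
)
print(result["answer"], result["score"])
```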

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 160
  • eval_batch_size: 40
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
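
A sketch of how these settings map onto the Hugging Face `Trainer` API; the base checkpoint and the tokenized datasets are placeholders, since neither is recorded in this card:

```python
from transformers import AutoModelForQuestionAnswering, Trainer, TrainingArguments

# Placeholder: the base checkpoint is not recorded in this card.
model = AutoModelForQuestionAnswering.from_pretrained("path/to/base-checkpoint")

args = TrainingArguments(
    output_dir="RoBERTa-legal-de-cased_German_legal_SQuAD_100",
    learning_rate=2e-5,
    per_device_train_batch_size=160,
    per_device_eval_batch_size=40,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # consistent with the per-epoch validation losses below
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default optimizer,
# so no explicit optimizer configuration is required.

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # placeholder: a tokenized SQuAD-style training split
    eval_dataset=eval_dataset,    # placeholder: a tokenized SQuAD-style validation split
)
trainer.train()
```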

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| No log | 1.0 | 2 | 6.2702 |
| No log | 2.0 | 4 | 6.2057 |
| No log | 3.0 | 6 | 6.0590 |
| No log | 4.0 | 8 | 5.8979 |
| No log | 5.0 | 10 | 5.4639 |
| No log | 6.0 | 12 | 5.3637 |
| No log | 7.0 | 14 | 5.1792 |
| No log | 8.0 | 16 | 4.9258 |
| No log | 9.0 | 18 | 4.7628 |
| No log | 10.0 | 20 | 4.5534 |
| No log | 11.0 | 22 | 4.3370 |
| No log | 12.0 | 24 | 4.1347 |
| No log | 13.0 | 26 | 3.9543 |
| No log | 14.0 | 28 | 3.7819 |
| No log | 15.0 | 30 | 3.6555 |
| No log | 16.0 | 32 | 3.5673 |
| No log | 17.0 | 34 | 3.4768 |
| No log | 18.0 | 36 | 3.3835 |
| No log | 19.0 | 38 | 3.3112 |
| No log | 20.0 | 40 | 3.2279 |
| No log | 21.0 | 42 | 3.1581 |
| No log | 22.0 | 44 | 3.0989 |
| No log | 23.0 | 46 | 3.0178 |
| No log | 24.0 | 48 | 2.9702 |
| No log | 25.0 | 50 | 2.9084 |
| No log | 26.0 | 52 | 2.8226 |
| No log | 27.0 | 54 | 2.8405 |
| No log | 28.0 | 56 | 2.8029 |
| No log | 29.0 | 58 | 2.6979 |
| No log | 30.0 | 60 | 2.7140 |
| No log | 31.0 | 62 | 2.6985 |
| No log | 32.0 | 64 | 2.6223 |
| No log | 33.0 | 66 | 2.6349 |
| No log | 34.0 | 68 | 2.5541 |
| No log | 35.0 | 70 | 2.4758 |
| No log | 36.0 | 72 | 2.4601 |
| No log | 37.0 | 74 | 2.4836 |
| No log | 38.0 | 76 | 2.3613 |
| No log | 39.0 | 78 | 2.2917 |
| No log | 40.0 | 80 | 2.3154 |
| No log | 41.0 | 82 | 2.2682 |
| No log | 42.0 | 84 | 2.2784 |
| No log | 43.0 | 86 | 2.2534 |
| No log | 44.0 | 88 | 2.1457 |
| No log | 45.0 | 90 | 2.1808 |
| No log | 46.0 | 92 | 2.2528 |
| No log | 47.0 | 94 | 2.1585 |
| No log | 48.0 | 96 | 2.0309 |
| No log | 49.0 | 98 | 2.0622 |
| No log | 50.0 | 100 | 2.0533 |
| No log | 51.0 | 102 | 1.9610 |
| No log | 52.0 | 104 | 1.9597 |
| No log | 53.0 | 106 | 1.8926 |
| No log | 54.0 | 108 | 1.8149 |
| No log | 55.0 | 110 | 1.7849 |
| No log | 56.0 | 112 | 1.8135 |
| No log | 57.0 | 114 | 1.8190 |
| No log | 58.0 | 116 | 1.8126 |
| No log | 59.0 | 118 | 1.8007 |
| No log | 60.0 | 120 | 1.7200 |
| No log | 61.0 | 122 | 1.6408 |
| No log | 62.0 | 124 | 1.6524 |
| No log | 63.0 | 126 | 1.6697 |
| No log | 64.0 | 128 | 1.6660 |
| No log | 65.0 | 130 | 1.5907 |
| No log | 66.0 | 132 | 1.5765 |
| No log | 67.0 | 134 | 1.5575 |
| No log | 68.0 | 136 | 1.5455 |
| No log | 69.0 | 138 | 1.5267 |
| No log | 70.0 | 140 | 1.4875 |
| No log | 71.0 | 142 | 1.4474 |
| No log | 72.0 | 144 | 1.4436 |
| No log | 73.0 | 146 | 1.4609 |
| No log | 74.0 | 148 | 1.4983 |
| No log | 75.0 | 150 | 1.4903 |
| No log | 76.0 | 152 | 1.4506 |
| No log | 77.0 | 154 | 1.3982 |
| No log | 78.0 | 156 | 1.3735 |
| No log | 79.0 | 158 | 1.3670 |
| No log | 80.0 | 160 | 1.3977 |
| No log | 81.0 | 162 | 1.4478 |
| No log | 82.0 | 164 | 1.4565 |
| No log | 83.0 | 166 | 1.4186 |
| No log | 84.0 | 168 | 1.3839 |
| No log | 85.0 | 170 | 1.3633 |
| No log | 86.0 | 172 | 1.3686 |
| No log | 87.0 | 174 | 1.3873 |
| No log | 88.0 | 176 | 1.3998 |
| No log | 89.0 | 178 | 1.4084 |
| No log | 90.0 | 180 | 1.4076 |
| No log | 91.0 | 182 | 1.3899 |
| No log | 92.0 | 184 | 1.3820 |
| No log | 93.0 | 186 | 1.3821 |
| No log | 94.0 | 188 | 1.3837 |
| No log | 95.0 | 190 | 1.3902 |
| No log | 96.0 | 192 | 1.3930 |
| No log | 97.0 | 194 | 1.3938 |
| No log | 98.0 | 196 | 1.3954 |
| No log | 99.0 | 198 | 1.3950 |
| No log | 100.0 | 200 | 1.3939 |

("No log" indicates that the training loss was never reported; with only 200 optimization steps in total, the run most likely never reached the Trainer's default logging interval of 500 steps.)

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0
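
To reproduce the environment, the versions above can be pinned, e.g. in a requirements.txt (the listed PyTorch build targets CUDA 12.1):

```
transformers==4.36.2
torch==2.1.2
datasets==2.14.7
tokenizers==0.15.0
```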