
RoBERTa-legal-de-cased_German_legal_SQuAD_part_augmented_100

The auto-generated card describes this model as trained from scratch and does not name the training dataset (it is recorded as None). After the final epoch, it achieves the following result on the evaluation set:

  • Loss: 1.4416

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 128
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
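To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step from the values listed above. It is a minimal illustration, not the training script: the total of 400 steps is read from the last row of the results table, and zero warmup steps is an assumption, since the card lists no warmup setting. The bounds on the training-set size further assume a single device, no gradient accumulation, and no dropped final batch.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 128
NUM_EPOCHS = 100

# Taken from the last row of the results table (step 400 at epoch 100).
TOTAL_STEPS = 400


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate after `step` updates under linear warmup, then linear decay to 0.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# If ceil(n_examples / batch_size) == steps_per_epoch (single device, no gradient
# accumulation, last partial batch kept), n_examples lies in this range.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    for step in (0, 100, 200, 300, 400):
        print(step, linear_lr(step))
    print(steps_per_epoch, min_examples, max_examples)
```

Under these assumptions the rate falls by the same amount at every step and reaches zero at step 400, and four steps per epoch at batch size 128 would correspond to a training set of a few hundred examples.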

Training results

Training Loss  Epoch  Step  Validation Loss
No log         1.0    4     6.3293
No log         2.0    8     5.8283
No log         3.0    12    5.4334
No log         4.0    16    5.2429
No log         5.0    20    4.8669
No log         6.0    24    4.4701
No log         7.0    28    4.2297
No log         8.0    32    4.0554
No log         9.0    36    3.8815
No log         10.0   40    3.5843
No log         11.0   44    3.4729
No log         12.0   48    3.3379
No log         13.0   52    3.2303
No log         14.0   56    3.1452
No log         15.0   60    3.1223
No log         16.0   64    3.0092
No log         17.0   68    2.9536
No log         18.0   72    2.8647
No log         19.0   76    2.8753
No log         20.0   80    2.7204
No log         21.0   84    2.7557
No log         22.0   88    2.6483
No log         23.0   92    2.5275
No log         24.0   96    2.4505
No log         25.0   100   2.4847
No log         26.0   104   2.3516
No log         27.0   108   2.2191
No log         28.0   112   2.1020
No log         29.0   116   2.1386
No log         30.0   120   2.1153
No log         31.0   124   2.0989
No log         32.0   128   1.8817
No log         33.0   132   2.0152
No log         34.0   136   1.8738
No log         35.0   140   1.9045
No log         36.0   144   1.8466
No log         37.0   148   1.7499
No log         38.0   152   1.8594
No log         39.0   156   1.7723
No log         40.0   160   1.8203
No log         41.0   164   1.7684
No log         42.0   168   1.5812
No log         43.0   172   1.7550
No log         44.0   176   1.6747
No log         45.0   180   1.6487
No log         46.0   184   1.6728
No log         47.0   188   1.6955
No log         48.0   192   1.6211
No log         49.0   196   1.6070
No log         50.0   200   1.6091
No log         51.0   204   1.5952
No log         52.0   208   1.4647
No log         53.0   212   1.4744
No log         54.0   216   1.5051
No log         55.0   220   1.6146
No log         56.0   224   1.5492
No log         57.0   228   1.5286
No log         58.0   232   1.4871
No log         59.0   236   1.5580
No log         60.0   240   1.5212
No log         61.0   244   1.5157
No log         62.0   248   1.5376
No log         63.0   252   1.4648
No log         64.0   256   1.4697
No log         65.0   260   1.5025
No log         66.0   264   1.4722
No log         67.0   268   1.4684
No log         68.0   272   1.5077
No log         69.0   276   1.4737
No log         70.0   280   1.4310
No log         71.0   284   1.4161
No log         72.0   288   1.4315
No log         73.0   292   1.4474
No log         74.0   296   1.4604
No log         75.0   300   1.4644
No log         76.0   304   1.4635
No log         77.0   308   1.4333
No log         78.0   312   1.4232
No log         79.0   316   1.4252
No log         80.0   320   1.3964
No log         81.0   324   1.4254
No log         82.0   328   1.4752
No log         83.0   332   1.4613
No log         84.0   336   1.4674
No log         85.0   340   1.4754
No log         86.0   344   1.4524
No log         87.0   348   1.4367
No log         88.0   352   1.4257
No log         89.0   356   1.4236
No log         90.0   360   1.4267
No log         91.0   364   1.4198
No log         92.0   368   1.4161
No log         93.0   372   1.4145
No log         94.0   376   1.4210
No log         95.0   380   1.4262
No log         96.0   384   1.4376
No log         97.0   388   1.4432
No log         98.0   392   1.4451
No log         99.0   396   1.4436
No log         100.0  400   1.4416
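
The validation loss in the table above does not decrease monotonically, so the final epoch is not necessarily the best one. The sketch below shows one way to pick the lowest-loss epoch from such a log. The `rows` list is a hand-copied excerpt of the table, and `best_epoch` is a hypothetical helper introduced for illustration, not part of the training code.

```python
# Excerpt of (epoch, validation_loss) rows copied from the results table.
# The excerpt includes the table's overall minimum (epoch 80) and its last row.
rows = [
    (1, 6.3293),
    (25, 2.4847),
    (50, 1.6091),
    (71, 1.4161),
    (80, 1.3964),
    (93, 1.4145),
    (100, 1.4416),
]


def best_epoch(rows):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


best = best_epoch(rows)
final = rows[-1]
gap = final[1] - best[1]  # how much worse the final epoch is than the best one

if __name__ == "__main__":
    print(f"best epoch: {best[0]} (loss {best[1]:.4f})")
    print(f"final epoch: {final[0]} (loss {final[1]:.4f}), gap {gap:.4f}")
```

On these rows the lowest loss is at epoch 80 (1.3964), slightly below the final-epoch value of 1.4416 that the card reports. Whether the published weights come from the final or the best checkpoint is not stated in the card.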

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0

Model size: 124M params (Safetensors, tensor type F32)