
bertlawbr

This model is a BERT-style language model for Brazilian legal text; the base checkpoint and training dataset are not identified in this card. It achieves the following results on the evaluation set:

  • Loss: 1.0495
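
The model name and the citation below suggest a BERT-style masked language model for Brazilian legal Portuguese. A minimal fill-mask sketch under that assumption, with "bertlawbr" standing in for the actual Hub repository id:

```python
# Hedged usage sketch (not from the card): assumes a BERT-style masked LM.
# "bertlawbr" is a placeholder; substitute the actual Hub repository id.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bertlawbr")

# Illustrative Brazilian-Portuguese legal sentence; [MASK] is BERT's mask token.
print(fill_mask("O réu foi [MASK] pelo tribunal."))
```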

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding TrainingArguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
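
A minimal sketch of how these settings map onto a Hugging Face TrainingArguments object; the output_dir and the single-GPU reading of the effective batch size are assumptions, not from the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bertlawbr",            # assumed output path
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=8,     # 16 * 8 = total train batch size of 128 on one GPU
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    warmup_steps=10_000,
    num_train_epochs=20.0,
    fp16=True,                         # "Native AMP" mixed precision
)
```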

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 6.1291 | 0.22 | 2500 | 5.9888 |
| 4.8604 | 0.44 | 5000 | 4.4841 |
| 3.3321 | 0.66 | 7500 | 3.1190 |
| 2.7579 | 0.87 | 10000 | 2.6089 |
| 2.4135 | 1.09 | 12500 | 2.3029 |
| 2.2136 | 1.31 | 15000 | 2.1244 |
| 2.0735 | 1.53 | 17500 | 1.9931 |
| 1.9684 | 1.75 | 20000 | 1.8878 |
| 1.891 | 1.97 | 22500 | 1.8077 |
| 1.8215 | 2.18 | 25000 | 1.7487 |
| 1.7577 | 2.4 | 27500 | 1.6875 |
| 1.7113 | 2.62 | 30000 | 1.6444 |
| 1.6776 | 2.84 | 32500 | 1.6036 |
| 1.6203 | 3.06 | 35000 | 1.5608 |
| 1.6018 | 3.28 | 37500 | 1.5293 |
| 1.5602 | 3.5 | 40000 | 1.5044 |
| 1.5429 | 3.71 | 42500 | 1.4753 |
| 1.5148 | 3.93 | 45000 | 1.4472 |
| 1.4786 | 4.15 | 47500 | 1.4302 |
| 1.4653 | 4.37 | 50000 | 1.4128 |
| 1.4496 | 4.59 | 52500 | 1.3991 |
| 1.4445 | 4.81 | 55000 | 1.3943 |
| 1.5114 | 5.02 | 57500 | 1.4551 |
| 1.5054 | 5.24 | 60000 | 1.4525 |
| 1.4817 | 5.46 | 62500 | 1.4259 |
| 1.48 | 5.68 | 65000 | 1.4077 |
| 1.4526 | 5.9 | 67500 | 1.3912 |
| 1.4272 | 6.12 | 70000 | 1.3726 |
| 1.4078 | 6.34 | 72500 | 1.3596 |
| 1.399 | 6.55 | 75000 | 1.3450 |
| 1.386 | 6.77 | 77500 | 1.3328 |
| 1.3704 | 6.99 | 80000 | 1.3192 |
| 1.3538 | 7.21 | 82500 | 1.3131 |
| 1.3468 | 7.43 | 85000 | 1.2916 |
| 1.323 | 7.65 | 87500 | 1.2871 |
| 1.322 | 7.86 | 90000 | 1.2622 |
| 1.2956 | 8.08 | 92500 | 1.2624 |
| 1.2869 | 8.3 | 95000 | 1.2547 |
| 1.2763 | 8.52 | 97500 | 1.2404 |
| 1.275 | 8.74 | 100000 | 1.2305 |
| 1.2709 | 8.96 | 102500 | 1.2301 |
| 1.2514 | 9.18 | 105000 | 1.2179 |
| 1.2563 | 9.39 | 107500 | 1.2134 |
| 1.2487 | 9.61 | 110000 | 1.2111 |
| 1.2337 | 9.83 | 112500 | 1.2041 |
| 1.3215 | 10.05 | 115000 | 1.2879 |
| 1.3364 | 10.27 | 117500 | 1.2850 |
| 1.3286 | 10.49 | 120000 | 1.2779 |
| 1.3202 | 10.7 | 122500 | 1.2730 |
| 1.3181 | 10.92 | 125000 | 1.2651 |
| 1.2952 | 11.14 | 127500 | 1.2544 |
| 1.2889 | 11.36 | 130000 | 1.2506 |
| 1.2747 | 11.58 | 132500 | 1.2339 |
| 1.2729 | 11.8 | 135000 | 1.2277 |
| 1.2699 | 12.02 | 137500 | 1.2201 |
| 1.2508 | 12.23 | 140000 | 1.2163 |
| 1.2438 | 12.45 | 142500 | 1.2091 |
| 1.2445 | 12.67 | 145000 | 1.2003 |
| 1.2314 | 12.89 | 147500 | 1.1957 |
| 1.2188 | 13.11 | 150000 | 1.1843 |
| 1.2071 | 13.33 | 152500 | 1.1805 |
| 1.2123 | 13.54 | 155000 | 1.1766 |
| 1.2016 | 13.76 | 157500 | 1.1661 |
| 1.2079 | 13.98 | 160000 | 1.1625 |
| 1.1884 | 14.2 | 162500 | 1.1525 |
| 1.177 | 14.42 | 165000 | 1.1419 |
| 1.1793 | 14.64 | 167500 | 1.1454 |
| 1.173 | 14.85 | 170000 | 1.1379 |
| 1.1502 | 15.07 | 172500 | 1.1371 |
| 1.1504 | 15.29 | 175000 | 1.1295 |
| 1.146 | 15.51 | 177500 | 1.1203 |
| 1.1487 | 15.73 | 180000 | 1.1137 |
| 1.1329 | 15.95 | 182500 | 1.1196 |
| 1.1259 | 16.17 | 185000 | 1.1075 |
| 1.1287 | 16.38 | 187500 | 1.1037 |
| 1.126 | 16.6 | 190000 | 1.1042 |
| 1.1199 | 16.82 | 192500 | 1.0953 |
| 1.1072 | 17.04 | 195000 | 1.0885 |
| 1.1043 | 17.26 | 197500 | 1.0877 |
| 1.1007 | 17.48 | 200000 | 1.0835 |
| 1.0879 | 17.69 | 202500 | 1.0819 |
| 1.1 | 17.91 | 205000 | 1.0744 |
| 1.0863 | 18.13 | 207500 | 1.0774 |
| 1.087 | 18.35 | 210000 | 1.0759 |
| 1.0755 | 18.57 | 212500 | 1.0618 |
| 1.0832 | 18.79 | 215000 | 1.0628 |
| 1.0771 | 19.01 | 217500 | 1.0611 |
| 1.0703 | 19.22 | 220000 | 1.0555 |
| 1.069 | 19.44 | 222500 | 1.0552 |
| 1.0706 | 19.66 | 225000 | 1.0509 |
| 1.0633 | 19.88 | 227500 | 1.0465 |
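
If these losses are mean cross-entropy values in nats (an assumption; this is the Trainer default for masked language modeling but is not stated on the card), they convert to perplexity via exp(loss), so validation perplexity falls from roughly 399 at step 2500 to about 2.85 at the end of training:

```python
# Hedged conversion: assumes the reported loss is mean cross-entropy in nats.
import math

final_eval_loss = 1.0495  # headline evaluation loss from this card
print(f"perplexity ~ {math.exp(final_eval_loss):.2f}")  # ~ 2.86
```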

Framework versions

  • Transformers 4.12.5
  • Pytorch 1.10.1+cu113
  • Datasets 1.17.0
  • Tokenizers 0.10.3
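
To reproduce this environment, the versions above can be pinned; a sketch of a matching requirements.txt (the CUDA 11.3 wheel source for the +cu113 build is an assumption about the install method):

```
transformers==4.12.5
torch==1.10.1+cu113  # assumed install: pip install -r requirements.txt -f https://download.pytorch.org/whl/cu113/torch_stable.html
datasets==1.17.0
tokenizers==0.10.3
```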

Citing & Authors

If you use our work, please cite:

@incollection{Viegas_2023,
    doi = {10.1007/978-3-031-36805-9_24},
    url = {https://doi.org/10.1007%2F978-3-031-36805-9_24},
    year = {2023},
    publisher = {Springer Nature Switzerland},
    pages = {349--365},
    author = {Charles F. O. Viegas and Bruno C. Costa and Renato P. Ishii},
    title = {{JurisBERT}: A New Approach that Converts a Classification Corpus into an {STS} One},
    booktitle = {Computational Science and Its Applications -- {ICCSA} 2023}
}