Access to this model is gated. The repository is publicly visible, but you must log in, accept the conditions, and agree to share your contact information before you can access its files and content.

roberta-el-ner18

This model is a fine-tuned version of cvcio/roberta-el-news on the elNER dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1380
  • Precision: 0.9138
  • Recall: 0.9289
  • F1: 0.9213
  • Accuracy: 0.9832
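
As a rough illustration of how a token-classification checkpoint like this one is usually queried, the sketch below uses the Transformers `pipeline` API. The repository id `cvcio/roberta-el-ner18` is an assumption (the card gives only the model name and the base model's namespace), and the `format_entities` helper and the example sentence are hypothetical. Because access is gated, the download is expected to work only from an authenticated session that has accepted the conditions.

```python
from typing import Dict, List

# Assumed repository id; the card states only the model name "roberta-el-ner18".
MODEL_ID = "cvcio/roberta-el-ner18"


def load_ner():
    """Build a token-classification pipeline (downloads the gated checkpoint)."""
    # Imported lazily so that format_entities can be used without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into entity spans.
    return pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",
    )


def format_entities(entities: List[Dict]) -> List[str]:
    """Render aggregated pipeline output as 'text [LABEL] (score)' strings."""
    return [
        f"{e['word'].strip()} [{e['entity_group']}] ({float(e['score']):.2f})"
        for e in entities
    ]


# Usage (needs network access and an authenticated session):
#   ner = load_ner()
#   print(format_entities(ner("Ο Γιώργος ζει στην Αθήνα.")))
```

The label names returned depend on the checkpoint's configuration, which this card does not list.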

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

More information needed

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 60.0
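
For readers who want to reproduce a similar configuration, the sketch below maps the list above onto keyword arguments named after `transformers.TrainingArguments` parameters. It is an approximation, not the author's training script: the output path is hypothetical, treating `train_batch_size` as the per-device batch size is an assumption, and so is the single-device setup (which is at least consistent with 32 × 4 = 128).

```python
# Keyword arguments named after transformers.TrainingArguments parameters,
# filled in from the hyperparameter list above.
training_kwargs = {
    "output_dir": "roberta-el-ner18",  # hypothetical output path
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,  # assumed equal to train_batch_size
    "per_device_eval_batch_size": 32,   # assumed equal to eval_batch_size
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.1,
    "num_train_epochs": 60.0,
}

NUM_DEVICES = 1  # assumption: the card lists no device count


def effective_batch_size(kwargs: dict, num_devices: int = NUM_DEVICES) -> int:
    """Number of examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_arguments():
    """Instantiate TrainingArguments; requires transformers and PyTorch."""
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs)


print(effective_batch_size(training_kwargs))  # 128
```

Gradient accumulation is what turns the batch size of 32 into the reported total of 128: gradients from four consecutive batches are summed before each optimizer step.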

Training results

Training Loss  Epoch  Step  Validation Loss  Precision  Recall  F1      Accuracy
0.4245         1.87   250   0.1622           0.7727     0.8096  0.7907  0.9597
0.0798         3.73   500   0.0841           0.8587     0.9005  0.8791  0.9776
0.0487         5.6    750   0.0812           0.8850     0.9140  0.8992  0.9806
0.0222         7.46   1000  0.0855           0.9001     0.9180  0.9089  0.9819
0.0141         9.33   1250  0.0903           0.9023     0.9230  0.9125  0.9827
0.0079         11.19  1500  0.1006           0.9067     0.9258  0.9161  0.9823
0.0063         13.06  1750  0.1020           0.9049     0.9296  0.9171  0.9826
0.0039         14.93  2000  0.1097           0.9078     0.9246  0.9161  0.9820
0.004          16.79  2250  0.1119           0.9084     0.9239  0.9161  0.9825
0.0024         18.66  2500  0.1166           0.9086     0.9268  0.9177  0.9828
0.0029         20.52  2750  0.1192           0.9106     0.9260  0.9182  0.9825
0.0023         22.39  3000  0.1161           0.9085     0.9284  0.9183  0.9829
0.0022         24.25  3250  0.1238           0.9078     0.9281  0.9178  0.9825
0.0021         26.12  3500  0.1232           0.9082     0.9239  0.9160  0.9821
0.0013         27.99  3750  0.1253           0.9050     0.9296  0.9172  0.9824
0.0012         29.85  4000  0.1247           0.9075     0.9284  0.9179  0.9827
0.0014         31.72  4250  0.1263           0.9063     0.9237  0.9149  0.9823
0.0012         33.58  4500  0.1295           0.9028     0.9272  0.9148  0.9827
0.001          35.45  4750  0.1341           0.9107     0.9305  0.9205  0.9831
0.001          37.31  5000  0.1296           0.9122     0.9298  0.9209  0.9833
0.0013         39.18  5250  0.1273           0.9058     0.9249  0.9153  0.9823
0.0007         41.04  5500  0.1296           0.9053     0.9261  0.9156  0.9824
0.0007         42.91  5750  0.1326           0.9083     0.9303  0.9192  0.9830
0.0006         44.78  6000  0.1328           0.9088     0.9270  0.9178  0.9828
0.0006         46.64  6250  0.1362           0.9103     0.9314  0.9207  0.9831
0.0004         48.51  6500  0.1351           0.9132     0.9288  0.9209  0.9830
0.0005         50.37  6750  0.1325           0.9138     0.9270  0.9204  0.9830
0.0005         52.24  7000  0.1330           0.9115     0.9272  0.9193  0.9832
0.0005         54.1   7250  0.1356           0.9119     0.9270  0.9194  0.9833
0.0004         55.97  7500  0.1367           0.9132     0.9274  0.9202  0.9832
0.0003         57.84  7750  0.1380           0.9141     0.9288  0.9214  0.9832
0.0004         59.7   8000  0.1380           0.9138     0.9289  0.9213  0.9832
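
The step and epoch columns of the training log above can be combined with the hyperparameters to estimate the learning-rate schedule. The sketch below infers the number of optimizer steps per epoch from a few logged (step, epoch) pairs, then evaluates a linear warmup followed by linear decay. The step counts are inferences from the log rather than values stated in the card, and `lr_at` is an illustrative helper assumed to mirror the shape of the Trainer's linear scheduler.

```python
# A few (step, epoch) pairs copied from the training log above.
LOGGED = [(250, 1.87), (750, 5.6), (2000, 14.93), (8000, 59.7)]

NUM_EPOCHS = 60
BASE_LR = 5e-5
WARMUP_RATIO = 0.1


def infer_steps_per_epoch(pairs, max_steps=1000):
    """Return every integer steps-per-epoch that reproduces the logged epochs."""
    return [
        n
        for n in range(1, max_steps + 1)
        if all(round(step / n, 2) == epoch for step, epoch in pairs)
    ]


candidates = infer_steps_per_epoch(LOGGED)
steps_per_epoch = candidates[0]
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = round(WARMUP_RATIO * total_steps)


def lr_at(step: int) -> float:
    """Linear warmup from 0 to BASE_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = total_steps - step
    return BASE_LR * max(0.0, remaining / max(1, total_steps - warmup_steps))


print(candidates, total_steps, warmup_steps)  # [134] 8040 804
```

Under these assumptions the learning rate peaks at 5e-05 around step 804 (roughly epoch 6) and decays to zero by step 8040.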

Eval results

Split  Precision  Recall  F1      Accuracy
eval   0.9138     0.9289  0.9213  0.9832
test   0.9097     0.9232  0.9164  0.9808
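
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 values can be recomputed from the other two columns. The sketch below does this for both splits. It assumes the three metrics come from the same averaging scheme, which the card does not state.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "eval": (0.9138, 0.9289, 0.9213),
    "test": (0.9097, 0.9232, 0.9164),
}

for split, (p, r, f1) in reported.items():
    # Print the recomputed value next to the reported one.
    print(split, round(f1_score(p, r), 4), f1)
```

Both recomputed values agree with the reported F1 to four decimal places.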

Framework versions

  • Transformers 4.29.2
  • PyTorch 1.13.1+cu117
  • Datasets 2.9.0
  • Tokenizers 0.13.2
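
To see how a local environment compares with the versions listed above, a small check along the following lines may help. It uses only the standard library. The `compare_versions` helper is illustrative, and a mismatch does not necessarily mean the model will fail to load.

```python
from importlib import metadata

# Versions listed in the card; "torch" is the PyPI name for PyTorch.
EXPECTED = {
    "transformers": "4.29.2",
    "torch": "1.13.1+cu117",
    "datasets": "2.9.0",
    "tokenizers": "0.13.2",
}


def compare_versions(expected: dict) -> dict:
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```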

Authors

Dimitris Papaevagelou - @andefined

About Us

Civic Information Office is a non-profit organization based in Athens, Greece, that creates technology and research products for the public interest.

Model size

  • 124M params
  • Format: Safetensors
  • Tensor types: I64, F32