---
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - f1
  - accuracy
model-index:
  - name: roberta-finetuned-CPV_Spanish
    results: []
---

# roberta-finetuned-CPV_Spanish

This model is a fine-tuned version of [PlanTL-GOB-ES/roberta-base-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-base-bne) on a dataset derived from Spanish Public Procurement documents from 2019. The complete fine-tuning process is documented in a Kaggle notebook.

It achieves the following results on the evaluation set (a sketch of how these multi-label metrics can be computed is given after the list):

- Loss: 0.0465
- F1: 0.7918
- ROC AUC: 0.8860
- Accuracy: 0.7376
- Coverage Error: 10.2744
- Label Ranking Average Precision Score: 0.7973
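
For reference, these multi-label metrics can be reproduced from binary ground-truth labels and predicted scores with scikit-learn. The sketch below reflects assumptions not stated in the card (micro averaging, exact-match accuracy, and a 0.5 decision threshold):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, coverage_error, f1_score,
                             label_ranking_average_precision_score, roc_auc_score)

def evaluate_multilabel(y_true, y_scores, threshold=0.5):
    """Compute the card's metrics from a binary indicator matrix and predicted scores.

    y_true:   (n_samples, n_labels) binary indicator matrix
    y_scores: (n_samples, n_labels) sigmoid probabilities
    Micro averaging and the 0.5 threshold are assumptions, not stated in the card.
    """
    y_pred = (y_scores >= threshold).astype(int)
    return {
        "f1": f1_score(y_true, y_pred, average="micro"),
        "roc_auc": roc_auc_score(y_true, y_scores, average="micro"),
        "accuracy": accuracy_score(y_true, y_pred),  # exact-match (subset) accuracy
        "coverage_error": coverage_error(y_true, y_scores),
        "label_ranking_average_precision_score":
            label_ranking_average_precision_score(y_true, y_scores),
    }
```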

## Intended uses & limitations

This model predicts only the first two digits of the CPV codes (the CPV division level).
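
A minimal inference sketch with 🤗 Transformers is shown below, assuming the model is used as a multi-label classifier with a sigmoid over the logits and a 0.5 threshold; the Hub namespace and the example text are placeholders, not part of this card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# The Hub namespace is an assumption; replace it with the actual repository id of this model.
model_id = "<namespace>/roberta-finetuned-CPV_Spanish"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "Servicios de mantenimiento de jardines y zonas verdes."  # example tender description

inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label setup: sigmoid over the logits, keep labels above a threshold (0.5 is an assumption).
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)  # predicted two-digit CPV divisions (label names depend on the exported config)
```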

## Training and evaluation data

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto 🤗 TrainingArguments follows the list):

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
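
The sketch below shows how these settings map onto 🤗 TrainingArguments; the output directory and the per-epoch evaluation strategy are assumptions, while the remaining values come directly from the list above.

```python
from transformers import TrainingArguments

# Sketch only: output_dir and evaluation_strategy are assumptions;
# the hyperparameter values are taken from the card.
training_args = TrainingArguments(
    output_dir="roberta-finetuned-CPV_Spanish",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    evaluation_strategy="epoch",
)
```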

### Training results

| Training Loss | Epoch | Step  | Validation Loss | F1     | ROC AUC | Accuracy | Coverage Error | Label Ranking Average Precision Score |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|:--------:|:--------------:|:-------------------------------------:|
| 0.0354        | 1.0   | 9054  | 0.0362          | 0.7560 | 0.8375  | 0.6963   | 14.0835        | 0.7357                                |
| 0.0311        | 2.0   | 18108 | 0.0331          | 0.7756 | 0.8535  | 0.7207   | 12.7880        | 0.7633                                |
| 0.0235        | 3.0   | 27162 | 0.0333          | 0.7823 | 0.8705  | 0.7283   | 11.5179        | 0.7811                                |
| 0.0157        | 4.0   | 36216 | 0.0348          | 0.7821 | 0.8699  | 0.7274   | 11.5836        | 0.7798                                |
| 0.011         | 5.0   | 45270 | 0.0377          | 0.7799 | 0.8787  | 0.7239   | 10.9173        | 0.7841                                |
| 0.008         | 6.0   | 54324 | 0.0395          | 0.7854 | 0.8787  | 0.7309   | 10.9042        | 0.7879                                |
| 0.0042        | 7.0   | 63378 | 0.0421          | 0.7872 | 0.8823  | 0.7300   | 10.5687        | 0.7903                                |
| 0.0025        | 8.0   | 72432 | 0.0439          | 0.7884 | 0.8867  | 0.7305   | 10.2220        | 0.7934                                |
| 0.0015        | 9.0   | 81486 | 0.0456          | 0.7889 | 0.8872  | 0.7316   | 10.1781        | 0.7945                                |
| 0.001         | 10.0  | 90540 | 0.0465          | 0.7918 | 0.8860  | 0.7376   | 10.2744        | 0.7973                                |

### Framework versions

- Transformers 4.16.2
- Pytorch 1.9.1
- Datasets 1.18.4
- Tokenizers 0.11.6