María Navas Loro
metadata
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - f1
  - accuracy
model-index:
  - name: roberta-finetuned-CPV_Spanish
    results: []

roberta-finetuned-CPV_Spanish

This model is a fine-tuned version of PlanTL-GOB-ES/roberta-base-bne, trained on a dataset derived from Spanish public procurement documents from 2019. The whole fine-tuning process is available in the following Kaggle notebook. After the final (10th) training epoch, it achieves the following results on the evaluation set:

  • Loss: 0.0460
  • F1: 0.7937
  • ROC AUC: 0.8857
  • Accuracy: 0.7398
  • Coverage Error: 10.3171
  • Label Ranking Average Precision Score: 0.7977
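Coverage Error and Label Ranking Average Precision Score are ranking metrics for multi-label problems, and they are less familiar than F1 or accuracy. The sketch below implements their standard definitions in plain Python to show what the numbers mean; it is not taken from the fine-tuning notebook. scikit-learn provides `coverage_error` and `label_ranking_average_precision_score` for the same purpose, and the tie and empty-label handling here is assumed to follow that convention.

```python
from typing import Sequence


def coverage_error(y_true: Sequence[Sequence[int]],
                   y_score: Sequence[Sequence[float]]) -> float:
    """Average number of top-ranked labels needed to cover every true label."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        true_scores = [s for t, s in zip(truth, scores) if t == 1]
        if not true_scores:
            continue  # a sample without true labels contributes 0
        lowest = min(true_scores)
        # Count the labels ranked at or above the worst-ranked true label.
        total += sum(1 for s in scores if s >= lowest)
    return total / len(y_true)


def label_ranking_average_precision(y_true: Sequence[Sequence[int]],
                                    y_score: Sequence[Sequence[float]]) -> float:
    """For each true label, the share of true labels among those ranked at or above it."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [s for t, s in zip(truth, scores) if t == 1]
        if not relevant or len(relevant) == len(scores):
            total += 1.0  # degenerate sample: every ranking is equally good
            continue
        sample = 0.0
        for s in relevant:
            rank = sum(1 for other in scores if other >= s)
            hits = sum(1 for other in relevant if other >= s)
            sample += hits / rank
        total += sample / len(relevant)
    return total / len(y_true)


# Toy data: two samples, three labels (illustrative values only).
toy_true = [[1, 0, 0], [0, 0, 1]]
toy_score = [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]]
print(coverage_error(toy_true, toy_score))                             # 2.5
print(round(label_ranking_average_precision(toy_true, toy_score), 4))  # 0.4167
```

Read this way, a coverage error of 10.3171 means that, on average, about ten of the highest-scored labels must be inspected before every correct label of a document has been seen.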

Intended uses & limitations

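This model predicts only the first two digits of a CPV (Common Procurement Vocabulary) code, that is, the division level. It does not predict the remaining digits of the full code.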
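A minimal inference sketch follows. It assumes the model is a multi-label classifier with one sigmoid output per two-digit code, as the ranking metrics reported above suggest. `MODEL_ID` is a hypothetical placeholder for the real Hub repository id, the 0.5 threshold is an assumption rather than a value from this card, and the label strings in the usage note are illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical placeholder: replace with the actual Hub repository id.
MODEL_ID = "<namespace>/roberta-finetuned-CPV_Spanish"


def decode_divisions(logits: Sequence[float],
                     id2label: Dict[int, str],
                     threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Turn raw logits into (label, probability) pairs above the threshold."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]  # per-label sigmoid
    picked = [(id2label[i], p) for i, p in enumerate(probs) if p >= threshold]
    return sorted(picked, key=lambda item: item[1], reverse=True)


def predict(text: str, model_id: str = MODEL_ID,
            threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Download the model and score one procurement text (needs network access)."""
    # Imported lazily so that decode_divisions works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode_divisions(logits, model.config.id2label, threshold)
```

For example, `decode_divisions([2.0, -3.0, 0.0], {0: "45", 1: "72", 2: "79"})` keeps the labels "45" and "79" and drops "72". Lowering the threshold trades precision for recall.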

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
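The hyperparameters above can be expressed as `TrainingArguments` for the Hugging Face `Trainer`. This is a sketch under stated assumptions, not the notebook's actual code: the mapping of `train_batch_size` and `eval_batch_size` to per-device batch sizes, the hypothetical `output_dir` value, and the absence of warmup (none is listed) are all assumptions. The `linear_lr` helper illustrates what the linear schedule does over the 90540 total steps reported in the training results.

```python
# Values copied from the hyperparameter list; key names follow TrainingArguments.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # assumed to match train_batch_size
    "per_device_eval_batch_size": 8,   # assumed to match eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}


def build_training_arguments(output_dir: str = "cpv-finetune"):
    """Build TrainingArguments; output_dir is a hypothetical local path."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)


def linear_lr(step: int, total_steps: int = 90540,
              base_lr: float = 2e-5, warmup_steps: int = 0) -> float:
    """Learning rate at a given step under linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the learning rate starts at 2e-05, is halved by the end of epoch 5 (step 45270), and reaches zero at step 90540.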

Training results

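| Training Loss | Epoch | Step  | Validation Loss | F1     | ROC AUC | Accuracy | Coverage Error | Label Ranking Average Precision Score |
|---------------|-------|-------|-----------------|--------|---------|----------|----------------|---------------------------------------|
| 0.0359        | 1.0   | 9054  | 0.0368          | 0.7527 | 0.8361  | 0.6920   | 14.2585        | 0.7318                                |
| 0.0314        | 2.0   | 18108 | 0.0332          | 0.7753 | 0.8518  | 0.7198   | 12.9053        | 0.7612                                |
| 0.0235        | 3.0   | 27162 | 0.0332          | 0.7824 | 0.8656  | 0.7284   | 11.8961        | 0.7767                                |
| 0.0166        | 4.0   | 36216 | 0.0348          | 0.7824 | 0.8725  | 0.7289   | 11.3928        | 0.7821                                |
| 0.0114        | 5.0   | 45270 | 0.0371          | 0.7825 | 0.8799  | 0.7271   | 10.8051        | 0.7871                                |
| 0.0079        | 6.0   | 54324 | 0.0398          | 0.7829 | 0.8765  | 0.7260   | 11.0922        | 0.7831                                |
| 0.0042        | 7.0   | 63378 | 0.0414          | 0.7889 | 0.8798  | 0.7317   | 10.7793        | 0.7891                                |
| 0.0025        | 8.0   | 72432 | 0.0434          | 0.7895 | 0.8847  | 0.7317   | 10.3856        | 0.7924                                |
| 0.0014        | 9.0   | 81486 | 0.0451          | 0.7928 | 0.8860  | 0.7356   | 10.3086        | 0.7960                                |
| 0.001         | 10.0  | 90540 | 0.0460          | 0.7937 | 0.8857  | 0.7398   | 10.3171        | 0.7977                                |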
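The per-epoch results do not all peak at the same epoch, which matters when choosing a checkpoint. The short sketch below copies four columns from the results above and reports the best epoch for each metric. The `best_epoch` helper is illustrative and not part of the original notebook.

```python
# (epoch, validation_loss, f1, coverage_error), copied from the training results.
ROWS = [
    (1, 0.0368, 0.7527, 14.2585),
    (2, 0.0332, 0.7753, 12.9053),
    (3, 0.0332, 0.7824, 11.8961),
    (4, 0.0348, 0.7824, 11.3928),
    (5, 0.0371, 0.7825, 10.8051),
    (6, 0.0398, 0.7829, 11.0922),
    (7, 0.0414, 0.7889, 10.7793),
    (8, 0.0434, 0.7895, 10.3856),
    (9, 0.0451, 0.7928, 10.3086),
    (10, 0.0460, 0.7937, 10.3171),
]


def best_epoch(rows, column: int, higher_is_better: bool) -> int:
    """Return the first epoch with the best value in the given column."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


print(best_epoch(ROWS, 1, higher_is_better=False))  # validation loss -> 2
print(best_epoch(ROWS, 2, higher_is_better=True))   # F1 -> 10
print(best_epoch(ROWS, 3, higher_is_better=False))  # coverage error -> 9
```

Validation loss is lowest at epochs 2 and 3 (0.0332) and rises afterwards while training loss keeps falling, yet F1 continues to improve up to epoch 10. Which checkpoint is preferable therefore depends on the metric that matters for the application.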

Framework versions

  • Transformers 4.16.2
  • PyTorch 1.9.1
  • Datasets 1.18.4
  • Tokenizers 0.11.6