mnavas/roberta-finetuned-CPV_Spanish

María Navas Loro


---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- f1
- accuracy
model-index:
- name: roberta-finetuned-CPV_Spanish
  results: []
---


	# roberta-finetuned-CPV_Spanish

This model is a fine-tuned version of [PlanTL-GOB-ES/roberta-base-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-base-bne) on a dataset derived from Spanish public procurement documents from 2019. It classifies text by CPV (Common Procurement Vocabulary) code. The whole fine-tuning process is available in this [Kaggle notebook](https://www.kaggle.com/code/marianavasloro/fine-tuned-roberta-for-spanish-cpv-codes).
After the final (10th) training epoch, it achieves the following results on the evaluation set:
	- Loss: 0.0152
	- F1: 0.9462
	- Roc Auc: 0.9698
	- Accuracy: 0.9297
	- Coverage Error: 3.6573
	- Label Ranking Average Precision Score: 0.9451
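The card does not state how these metrics were computed. As a rough sketch, assuming a multi-label setup with a per-label sigmoid, a 0.5 decision threshold, and micro averaging (all assumptions, not confirmed by the card), the same set of metrics can be obtained with scikit-learn. The function name `multilabel_metrics` is hypothetical.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    coverage_error,
    f1_score,
    label_ranking_average_precision_score,
    roc_auc_score,
)


def multilabel_metrics(logits, y_true, threshold=0.5):
    """Compute the card's metric set for multi-label predictions.

    logits: float array of shape (n_samples, n_labels), raw model outputs.
    y_true: 0/1 array of the same shape.
    """
    logits = np.asarray(logits, dtype=float)
    y_true = np.asarray(y_true, dtype=int)
    probs = 1.0 / (1.0 + np.exp(-logits))      # per-label sigmoid (assumed)
    y_pred = (probs >= threshold).astype(int)  # 0.5 cut-off (assumed)
    return {
        "f1": f1_score(y_true, y_pred, average="micro"),
        "roc_auc": roc_auc_score(y_true, probs, average="micro"),
        # On multi-label input this is exact-match (subset) accuracy.
        "accuracy": accuracy_score(y_true, y_pred),
        # Ranking metrics use the scores, not the thresholded labels.
        "coverage_error": coverage_error(y_true, probs),
        "lrap": label_ranking_average_precision_score(y_true, probs),
    }
```

Coverage error is the average number of top-ranked labels that must be kept to include every true label, so lower is better, while the other four metrics are better when higher.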

	## Intended uses & limitations

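This model only predicts the first two digits of a CPV code, which identify its division (the top level of the CPV hierarchy). It does not predict the remaining digits.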
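To compare the model's output with full CPV codes, the codes first have to be reduced to their two-digit prefix. A minimal sketch, assuming codes are written as eight digits with an optional check digit (for example `45233120-6`); the helper names `cpv_division` and `division_labels` are hypothetical and not part of the model or the notebook:

```python
import re

# Assumption: "45233120" or "45233120-6" (eight digits, optional check digit).
_CPV_PATTERN = re.compile(r"^(\d{8})(?:-\d)?$")


def cpv_division(code: str) -> str:
    """Return the two-digit division prefix of a full CPV code."""
    match = _CPV_PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"not a CPV code: {code!r}")
    return match.group(1)[:2]


def division_labels(codes):
    """Collapse full CPV codes into a sorted list of unique divisions."""
    return sorted({cpv_division(c) for c in codes})
```

For instance, `division_labels(["45233120-6", "45000000-7", "72000000-5"])` gives `["45", "72"]`: two codes from the same division collapse into a single label.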

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10

	### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 | Roc Auc | Accuracy | Coverage Error | Label Ranking Average Precision Score |
|:-------------:|:-----:|:------:|:---------------:|:------:|:-------:|:--------:|:--------------:|:-------------------------------------:|
| 0.0287 | 1.0 | 20385 | 0.0270 | 0.8235 | 0.8815 | 0.7695 | 10.4603 | 0.8167 |
| 0.0172 | 2.0 | 40770 | 0.0199 | 0.8773 | 0.9210 | 0.8306 | 7.5943 | 0.8768 |
| 0.01 | 3.0 | 61155 | 0.0168 | 0.9028 | 0.9364 | 0.8639 | 6.2111 | 0.9045 |
| 0.0062 | 4.0 | 81540 | 0.0152 | 0.9207 | 0.9520 | 0.8871 | 5.1353 | 0.9213 |
| 0.0037 | 5.0 | 101925 | 0.0151 | 0.9300 | 0.9569 | 0.9026 | 4.7350 | 0.9295 |
| 0.0021 | 6.0 | 122310 | 0.0147 | 0.9365 | 0.9625 | 0.9123 | 4.2946 | 0.9355 |
| 0.0013 | 7.0 | 142695 | 0.0148 | 0.9396 | 0.9659 | 0.9184 | 3.9912 | 0.9387 |
| 0.001 | 8.0 | 163080 | 0.0150 | 0.9426 | 0.9680 | 0.9243 | 3.8065 | 0.9422 |
| 0.0006 | 9.0 | 183465 | 0.0152 | 0.9445 | 0.9693 | 0.9274 | 3.7064 | 0.9438 |
| 0.0003 | 10.0 | 203850 | 0.0152 | 0.9462 | 0.9698 | 0.9297 | 3.6573 | 0.9451 |
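A few things can be read off this table with simple arithmetic. The sketch below copies four of its columns and derives an approximate training-set size and the best epoch under two criteria. The size estimate assumes one optimizer step per batch of 8 on a single device with no gradient accumulation, which the card does not confirm.

```python
# (epoch, step, validation_loss, f1) copied from the training results table
ROWS = [
    (1, 20385, 0.0270, 0.8235),
    (2, 40770, 0.0199, 0.8773),
    (3, 61155, 0.0168, 0.9028),
    (4, 81540, 0.0152, 0.9207),
    (5, 101925, 0.0151, 0.9300),
    (6, 122310, 0.0147, 0.9365),
    (7, 142695, 0.0148, 0.9396),
    (8, 163080, 0.0150, 0.9426),
    (9, 183465, 0.0152, 0.9445),
    (10, 203850, 0.0152, 0.9462),
]

STEPS_PER_EPOCH = ROWS[0][1]
TRAIN_BATCH_SIZE = 8  # train_batch_size from the hyperparameter list

# Upper bound: the last batch of an epoch may hold fewer than 8 examples.
approx_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

best_by_loss = min(ROWS, key=lambda row: row[2])  # lowest validation loss
best_by_f1 = max(ROWS, key=lambda row: row[3])    # highest F1

print(approx_train_examples, best_by_loss[0], best_by_f1[0])
```

Validation loss is lowest at epoch 6 (0.0147) and rises slightly afterwards, while F1 and the other metrics keep improving through epoch 10, which is the checkpoint whose results are reported at the top of the card.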


	### Framework versions

	- Transformers 4.16.2
	- Pytorch 1.9.1
	- Datasets 1.18.4
	- Tokenizers 0.11.6
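
When reproducing the results, it can help to check the local environment against these versions. A small sketch using only the standard library; the helper name `version_mismatches` is hypothetical, and `torch` is assumed to be the installed distribution name for PyTorch:

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed above, keyed by PyPI distribution name.
PINNED = {
    "transformers": "4.16.2",
    "torch": "1.9.1",
    "datasets": "1.18.4",
    "tokenizers": "0.11.6",
}


def version_mismatches(pinned=PINNED, lookup=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        # torch wheels may carry a local suffix such as "+cu111"; ignore it.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

An empty result from `version_mismatches()` means the installed packages match the versions listed on the card. Newer versions may well work, but that is not something the card verifies.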