distilbert_ORO_Branch
This model is a fine-tuned version of distilbert-base-uncased that classifies an article's text (title + abstract + keywords). It is intended to be used AFTER an article's relevance to an ocean-related option (ORO) has been established (see the screening model card on Hugging Face). This model then classifies a relevant article further into the type of ORO: Mitigation, Natural resilience, or Societal adaptation.
It achieves the following results on the evaluation set:
- Train Loss: 0.1604
- Train Binary Accuracy: 0.9509
- Validation Loss: 0.2817
- Validation Binary Accuracy: 0.8946
- Epoch: 2
Model description
This model predicts relevance to each of three labels specifying the type of ocean-related option, as a value between 0 and 1 per label. A value > 0.5 therefore indicates that the article is more likely than not to be relevant to that type of ORO.
Intended uses & limitations
This model is intended to be applied to article text (title + abstract + keywords) retrieved with a search query from citation-indexed databases such as Web of Science or Scopus. It can automatically classify relevant articles from a large volume of literature, and the resulting labels can be used in analyses that provide a granular map of the distribution of relevant studies.
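As an illustration, the minimal sketch below loads the checkpoint and classifies a single article. The repo id, the label order, and the explicit sigmoid are assumptions (if the checkpoint already outputs probabilities, skip the sigmoid); this is not an official usage script.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

REPO_ID = "distilbert_ORO_Branch"  # placeholder; replace with the actual Hub repo id
LABELS = ["Mitigation", "Natural resilience", "Societal adaptation"]  # assumed order

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = TFAutoModelForSequenceClassification.from_pretrained(REPO_ID)

# Concatenate title, abstract and keywords into a single input string.
title = "Example title"
abstract = "Example abstract text ..."
keywords = "ocean; climate; adaptation"
text = " ".join([title, abstract, keywords])

inputs = tokenizer(text, truncation=True, return_tensors="tf")
logits = model(**inputs).logits               # shape (1, 3)
probs = tf.sigmoid(logits).numpy()[0]         # per-label scores in [0, 1] (assumes raw logits)

for label, p in zip(LABELS, probs):
    print(f"{label}: {p:.3f} -> {'relevant' if p > 0.5 else 'not relevant'}")
```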
Training and evaluation data
For a description of the dataset, see the paper (in prep.)
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamW', 'learning_rate': 1e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay': 0.0, 'exclude_from_weight_decay': None}
- training_precision: float32
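For reference, the sketch below reconstructs an equivalent optimizer from the settings above using the Keras AdamW implementation available in TensorFlow 2.12. It is not the authors' original training script, and the commented compile call (binary cross-entropy over sigmoid logits) is an assumption based on the binary-accuracy metric reported in the results table.

```python
import tensorflow as tf

# Mirror the listed hyperparameters.
optimizer = tf.keras.optimizers.AdamW(
    learning_rate=1e-5,
    weight_decay=0.0,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-7,
    amsgrad=False,
)

# Assumed multi-label setup matching the binary-accuracy metric:
# model.compile(
#     optimizer=optimizer,
#     loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
#     metrics=[tf.keras.metrics.BinaryAccuracy()],
# )
```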
Training results
| Train Loss | Train Binary Accuracy | Validation Loss | Validation Binary Accuracy | Epoch |
|---|---|---|---|---|
| 0.4863 | 0.7650 | 0.4002 | 0.8368 | 0 |
| 0.2445 | 0.9238 | 0.2649 | 0.8993 | 1 |
| 0.1604 | 0.9509 | 0.2817 | 0.8946 | 2 |
Framework versions
- Transformers 4.30.2
- TensorFlow 2.12.0
- Datasets 2.18.0
- Tokenizers 0.13.3