---
license: mit
language:
  - en
metrics:
  - accuracy
  - matthews_correlation
---

# sadickam/sdg-classification-bert

This model classifies text with respect to the United Nations Sustainable Development Goals (SDGs).

## Model Details

### Model Description

This text classification model was developed by fine-tuning the bert-base-uncased pre-trained model. The training data was sourced from the publicly available OSDG Community Dataset (OSDG-CD) at https://zenodo.org/record/5550238#.ZBulfcJByF4. The model was developed as part of academic research at Deakin University, with the goal of producing a transformer-based SDG text classification model that anyone can use. Only the first 16 UN SDGs are supported. The primary model details are highlighted below:

- **Model type:** Text classification
- **Language(s) (NLP):** English
- **License:** MIT
- **Fine-tuned from model:** bert-base-uncased


## Direct Use

This is a fine-tuned model and can be used for inference directly; no further fine-tuning is required.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classification model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("sadickam/sdg-classification-bert")
model = AutoModelForSequenceClassification.from_pretrained("sadickam/sdg-classification-bert")
```
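Once loaded, the model can be used for inference in the usual transformers way. A minimal sketch is below; the mapping of output index `i` to SDG `i + 1` is an assumption based on this card (16 supported goals), so check `model.config.id2label` for the authoritative label names.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("sadickam/sdg-classification-bert")
model = AutoModelForSequenceClassification.from_pretrained("sadickam/sdg-classification-bert")

# Example sentence (paraphrasing an SDG 7 theme)
text = "Ensure access to affordable, reliable, sustainable and modern energy for all."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Forward pass without gradient tracking, then convert logits to probabilities
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)
pred = int(torch.argmax(probs, dim=-1))

# Assumed index-to-SDG mapping; verify against model.config.id2label
print(f"Predicted class index: {pred}, confidence: {probs[0, pred]:.2f}")
```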

## Training Data

The training data includes text from a wide range of industries and academic research fields; the model is therefore not tailored to any specific industry.

The training data is available here: https://zenodo.org/record/5550238#.ZBulfcJByF4

### Training Hyperparameters

- Number of epochs: 3
- Learning rate: 5e-5
- Batch size: 16

## Evaluation

### Metrics

- Accuracy = 0.90
- Matthews correlation = 0.89
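Both metrics can be computed with scikit-learn. The snippet below uses toy labels purely to illustrate the calculation; the figures reported above come from the model's own held-out evaluation, not from this example.

```python
from sklearn.metrics import accuracy_score, matthews_corrcoef

# Toy ground-truth and predicted class labels, for illustration only
y_true = [1, 2, 3, 3, 2, 1]
y_pred = [1, 2, 3, 2, 2, 1]

acc = accuracy_score(y_true, y_pred)   # fraction of exact matches
mcc = matthews_corrcoef(y_true, y_pred)  # balanced measure in [-1, 1]
print(f"Accuracy: {acc:.2f}, Matthews correlation: {mcc:.2f}")
```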