metadata

license: mit
language:
  - en
metrics:
  - accuracy
  - matthews_correlation

sadickam/sdg-classification-bert

This model classifies text according to the United Nations Sustainable Development Goals (SDGs).

Model Details

Model Description

This text classification model was developed by fine-tuning the bert-base-uncased pre-trained model. The training data was sourced from the publicly available OSDG Community Dataset (OSDG-CD) at https://zenodo.org/record/5550238#.ZBulfcJByF4. The model was developed as part of academic research at Deakin University, with the goal of providing a transformer-based SDG text classification model that anyone can use. Only the first 16 UN SDGs are supported. The primary model details are listed below:

Model type: Text classification
Language(s) (NLP): English
License: MIT
Fine-tuned from model: bert-base-uncased

Model Sources

Repository: https://github.com/sadickam/sdg-classification-bert
Demo: option 1: https://sadickam-sdg-text-classifier.hf.space/; option 2: https://sadickam-sdg-classification-bert-main-qxg1gv.streamlit.app/

Direct Use

The model is already fine-tuned for SDG classification and can be used directly for inference without further fine-tuning.

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classification model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("sadickam/sdg-classification-bert")
model = AutoModelForSequenceClassification.from_pretrained("sadickam/sdg-classification-bert")
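
Loading the tokenizer and model does not by itself produce a prediction. The following is a minimal sketch of one way to classify a single text, assuming PyTorch is installed and that the model's config exposes an `id2label` mapping (if it does not, generic names such as `LABEL_0` are returned). The helper names `softmax`, `top_prediction`, and `classify` are illustrative and are not part of the model or of the transformers library.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, id2label):
    """Return the (label, probability) pair with the highest probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text, model_name="sadickam/sdg-classification-bert"):
    """Classify one text; downloads the model on first use."""
    # Imported lazily so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    # BERT accepts at most 512 tokens, so longer inputs are truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Assumption: config.id2label maps class indices to SDG label names.
    return top_prediction(logits, model.config.id2label)
```

Calling `classify("Access to clean water and sanitation remains limited in rural areas.")` would return a label and its probability; the exact label strings depend on the model's configuration.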

Training Data

The training data includes text from a wide range of industries and academic research fields, so the fine-tuned model is not specific to any one industry.

See the training data here: https://zenodo.org/record/5550238#.ZBulfcJByF4

Training Hyperparameters

Number of epochs = 3
Learning rate = 5e-5
Batch size = 16
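
For readers who want to reproduce a similar setup, the sketch below shows how these three values could be passed to the transformers `TrainingArguments` class. This card does not state which training loop was used, so treating the batch size as `per_device_train_batch_size` (single device, no gradient accumulation) is an assumption, as are the `output_dir` value and the helper names. The `total_steps` helper only illustrates the arithmetic; the dataset size passed to it is hypothetical.

```python
import math

# Values taken from this model card; the argument names are an assumed mapping.
HYPERPARAMS = {
    "num_train_epochs": 3,
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,
}


def total_steps(num_examples):
    """Optimizer steps for a dataset of the given (hypothetical) size,
    assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / HYPERPARAMS["per_device_train_batch_size"])
    return steps_per_epoch * HYPERPARAMS["num_train_epochs"]


def build_training_args(output_dir="sdg-bert-out"):
    """Build TrainingArguments from the values above (output_dir is a placeholder)."""
    # Imported lazily so total_steps works without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

All other training settings (optimizer, warmup, weight decay, sequence length) are not given here and would fall back to library defaults in this sketch.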

Evaluation

Metrics

Accuracy = 0.9
Matthews correlation = 0.89
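
As a reminder of what the two metrics measure, here is a minimal pure-Python sketch of accuracy and the multi-class Matthews correlation coefficient. The function names and the toy labels in the usage note are illustrative only; they are not the evaluation code or data used for this model.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def matthews_corrcoef(y_true, y_pred):
    """Multi-class Matthews correlation coefficient.

    Uses MCC = (c*s - sum_k p_k*t_k) /
               sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2)),
    where s is the sample count, c the number of correct predictions,
    t_k how often class k truly occurs, and p_k how often it is predicted.
    """
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    classes = set(t_counts) | set(p_counts)

    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in classes)
    denom = math.sqrt(
        (s * s - sum(p_counts[k] ** 2 for k in classes))
        * (s * s - sum(t_counts[k] ** 2 for k in classes))
    )
    # Convention: return 0.0 when the denominator vanishes (e.g. one class only).
    return numerator / denom if denom else 0.0
```

With toy labels `y_true = [1, 1, 2, 2]` and `y_pred = [1, 2, 1, 2]`, `accuracy` gives 0.5 while `matthews_corrcoef` gives 0.0, which shows that the Matthews correlation rates chance-level predictions as uninformative even when half of them are correct.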