license: mit
language:
- en
metrics:
- accuracy
- matthews_correlation
widget:
- text: >-
    Highway work zones create potential risks for both traffic and workers in
    addition to traffic congestion and delays that result in increased road
    user delay.
- text: >-
    A circular economy is a way of achieving sustainable consumption and
    production, as well as nature positive outcomes.
sadickam/sdg-classification-bert
This model (sgdBERT) classifies text with respect to the United Nations Sustainable Development Goals (SDGs).
Source: https://www.un.org/development/desa/disabilities/about-us/sustainable-development-goals-sdgs-and-disability.html
Model Details
Model Description
This text classification model was developed by fine-tuning the bert-base-uncased pre-trained model. The training data was sourced from the publicly available OSDG Community Dataset (OSDG-CD) at https://zenodo.org/record/5550238#.ZBulfcJByF4. The model was built as part of academic research at Deakin University, with the goal of providing a transformer-based SDG text classification model that anyone can use. Only the first 16 UN SDGs are supported. The primary model details are listed below:
- Model type: Text classification
- Language(s) (NLP): English
- License: mit
- Fine-tuned from model: bert-base-uncased
Model Sources
- Repository: https://huggingface.co/sadickam/sdg-classification-bert
- Demo: option 1: https://sadickam-sdg-text-classifier.hf.space/; option 2: https://sadickam-sdg-classification-bert-main-qxg1gv.streamlit.app/
Direct Use
This is a fine-tuned model and therefore requires no further training.
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("sadickam/sdg-classification-bert")
model = AutoModelForSequenceClassification.from_pretrained("sadickam/sdg-classification-bert")
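Once the tokenizer and model are loaded, a prediction can be obtained roughly as sketched below. This is a minimal illustration, not the author's inference code: the helper names (`softmax`, `top_k`, `classify`) are hypothetical, and the label strings are assumed to come from the `id2label` mapping stored in the model's configuration.

```python
import math

MODEL_ID = "sadickam/sdg-classification-bert"


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, id2label, k=3):
    """Return the k most probable (label, probability) pairs."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify(text, k=3):
    """Classify one text (hypothetical helper; downloads the model on first use)."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    # BERT accepts at most 512 tokens, so longer inputs are truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Assumption: model.config.id2label holds the SDG label names.
    return top_k(softmax(logits), model.config.id2label, k)
```

Calling `classify("A circular economy is a way of achieving sustainable consumption and production.")` would return a list of `(label, probability)` pairs, most probable first.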
Training Data
The training data includes text from a wide range of industries and academic research fields. Hence, this fine-tuned model is not for a specific industry.
See the training data here: https://zenodo.org/record/5550238#.ZBulfcJByF4
Training Hyperparameters
- Number of epochs = 3
- Learning rate = 5e-5
- Batch size = 16
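As a rough illustration, the hyperparameters above could be mapped onto the Hugging Face `Trainer` API as follows. The card does not state which training loop was used, so the use of `TrainingArguments`, the `output_dir` value, and the reading of the batch size as per-device are all assumptions.

```python
# Hyperparameters as reported in this card.
HYPERPARAMS = {
    "num_train_epochs": 3,
    "learning_rate": 5e-5,
    # Assumption: the reported batch size is per device, not a global total.
    "per_device_train_batch_size": 16,
}


def build_training_args(output_dir="sdg-bert-finetune"):
    """Build transformers.TrainingArguments from the reported values.

    output_dir is a hypothetical path chosen for this sketch.
    """
    # Imported lazily so the dictionary above can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Any setting not listed in the card (optimizer, warmup, weight decay, maximum sequence length) would fall back to the library defaults here, which may differ from what was actually used.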
Evaluation
Metrics
- Accuracy = 0.90
- Matthews correlation = 0.89
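To compute the same two metrics on your own predictions, a minimal pure-Python sketch follows. It uses the standard multiclass form of the Matthews correlation coefficient (the quantity that `sklearn.metrics.matthews_corrcoef` also computes). The function names are illustrative and are not taken from the original evaluation code.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Matthews correlation coefficient for any number of classes."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)  # how often each class truly occurs
    pred_counts = Counter(y_pred)  # how often each class is predicted
    classes = set(true_counts) | set(pred_counts)

    numerator = correct * n - sum(true_counts[k] * pred_counts[k] for k in classes)
    pred_term = n * n - sum(v * v for v in pred_counts.values())
    true_term = n * n - sum(v * v for v in true_counts.values())
    denominator = math.sqrt(pred_term * true_term)

    # By convention, return 0.0 when the coefficient is undefined
    # (for example, when every prediction is the same class).
    return numerator / denominator if denominator else 0.0
```

Unlike accuracy, the Matthews correlation accounts for how predictions are distributed across classes, which makes it more informative when the SDG classes are imbalanced.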
Citation
Sadick, A.M. (2023). SDG classification with BERT. https://huggingface.co/sadickam/sdg-classification-bert