---
license: cc-by-4.0
language:
- en
- nl
- de
- fr
- it
- is
- cs
- da
- es
- ca
metrics:
- accuracy
- matthews_correlation
pipeline_tag: text-classification
library_name: keras
---

# Aurora SDG Multi-Label Multi-Class Model

This model classifies texts related to the United Nations Sustainable Development Goals (SDGs) in multiple languages.

![image](https://user-images.githubusercontent.com/73560591/216751462-ced482ba-5d8e-48aa-9a48-5557979a35f1.png)

Source: https://sdgs.un.org/goals

## Model Details

### Model Description

This text classification model was developed by fine-tuning the pre-trained bert-base-multilingual-uncased model. The training data for this fine-tuned model was sourced from the publicly available OSDG Community Dataset (OSDG-CD): https://zenodo.org/record/5550238#.ZBulfcJByF4

This model was made as part of academic research at Deakin University. The goal was to build a transformer-based SDG text classification model that anyone can use. Only the first 16 UN SDGs are supported. The primary model details are highlighted below:

- **Model type:** Text classification
- **Language(s) (NLP):** English, Dutch, German, Icelandic, French, Czech, Italian, Danish, Spanish, Catalan
- **License:** cc-by-4.0
- **Finetuned from model:** bert-base-multilingual-uncased

### Model Sources

- **Repository:** option 1: https://huggingface.co/MauriceV2021/AuroraSDGsModel ; option 2: https://doi.org/10.5281/zenodo.7304546
- **Demo:** option 1: https://huggingface.co/spaces/MauriceV2021/SDGclassifier ; option 2: https://aurora-universities.eu/sdg-research/classify/

### Direct Use

This is a fine-tuned model and therefore requires no further training.
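
Because the model is multi-label, a single text can be assigned to several SDGs at once. The sketch below shows one way to turn per-goal scores into labels. It assumes the model emits one independent probability per goal, ordered SDG 1 to SDG 16, and that a cut-off of 0.5 is appropriate; neither the output layout nor the threshold is documented in this card, and the helper name `decode_sdg_scores` is hypothetical.

```python
from typing import List, Sequence

NUM_SDGS = 16  # the card states that only the first 16 SDGs are supported


def decode_sdg_scores(scores: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Return the SDG labels whose score meets the threshold.

    `scores` is assumed to hold one independent probability per goal,
    ordered SDG 1 .. SDG 16 (an assumed layout, not confirmed by the card).
    """
    if len(scores) != NUM_SDGS:
        raise ValueError(f"expected {NUM_SDGS} scores, got {len(scores)}")
    return [f"SDG {i}" for i, s in enumerate(scores, start=1) if s >= threshold]


# Made-up scores for illustration only; they are not real model output.
example_scores = [0.02, 0.10, 0.91, 0.05, 0.03, 0.64, 0.77, 0.01,
                  0.08, 0.04, 0.12, 0.30, 0.88, 0.02, 0.06, 0.49]

print(decode_sdg_scores(example_scores))
# ['SDG 3', 'SDG 6', 'SDG 7', 'SDG 13']
```

Lowering `threshold` trades precision for recall, so the cut-off is worth tuning on held-out data rather than fixing at 0.5.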
## How to Get Started with the Model

Use the code here to get started with the model: https://github.com/Aurora-Network-Global/sdgs_many_berts

## Training Data

The training data includes text from 1.4 titles and abstracts of academic research papers, labeled with SDGs and their targets according to an initial validated query.

See training data here: https://doi.org/10.5281/zenodo.5205672

### Evaluation of the Training Data

- Avg_precision = 0.70
- Avg_recall = 0.15

The data was evaluated by 244 senior researchers with domain expertise.

See evaluation report on the training data here: https://doi.org/10.5281/zenodo.4917107

## Training Hyperparameters

## Evaluation

### Metrics

- Accuracy = 0.9
- Matthews correlation = 0.89

See evaluation report on the model here: https://doi.org/10.5281/zenodo.5603019

## Citation

Sadick, A.M. (2023). SDG classification with BERT. https://huggingface.co/sadickam/sdg-classification-bert
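
For readers unfamiliar with the two metrics reported under Evaluation, the sketch below computes accuracy and the Matthews correlation coefficient for the binary case of a single SDG label (text belongs to the goal or not). The card does not state how its figures were aggregated across goals, so this only illustrates the standard definitions; the function names and the toy labels are hypothetical.

```python
import math
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the reference labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def matthews_corrcoef(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary Matthews correlation coefficient, in the range [-1, 1]."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the confusion matrix is empty
    return (tp * tn - fp * fn) / denom


# Toy labels for illustration only; they are not taken from the evaluation report.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

print(accuracy(y_true, y_pred))           # 0.75
print(matthews_corrcoef(y_true, y_pred))  # 0.5
```

Unlike accuracy, the Matthews correlation uses all four cells of the confusion matrix, which makes it more informative when a goal has few positive examples.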