	---
	license: cc-by-4.0
	language:
	- en
	- nl
	- de
	- fr
	- it
	- is
	- cs
	- da
	- es
	- ca
	metrics:
	- accuracy
	- matthews_correlation
	pipeline_tag: text-classification
	---
	# Aurora SDG Multi-Label Multi-Class Model

	<!-- Provide a quick summary of what the model is/does. -->
	This model classifies texts against the United Nations Sustainable Development Goals (SDGs) in multiple languages.

	![image](https://user-images.githubusercontent.com/73560591/216751462-ced482ba-5d8e-48aa-9a48-5557979a35f1.png)
	Source: https://sdgs.un.org/goals

	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->

	This text classification model was developed by fine-tuning the pre-trained bert-base-multilingual-uncased model. The training data for this fine-tuned model was sourced from the publicly available OSDG Community Dataset (OSDG-CD) at https://zenodo.org/record/5550238#.ZBulfcJByF4.
	This model was made as part of academic research at Deakin University. The goal was to make a transformer-based SDG text classification model that anyone could use. Only the first 16 UN SDGs are supported. The primary model details are highlighted below:

	- Model type: Text classification
	- Language(s) (NLP): English, Dutch, German, Icelandic, French, Czech, Italian, Danish, Spanish, Catalan
	- License: cc-by-4.0
	- Fine-tuned from model: bert-base-multilingual-uncased

	### Model Sources
	<!-- Provide the basic links for the model. -->
	- Repository: option 1: https://huggingface.co/MauriceV2021/AuroraSDGsModel ; option 2: https://doi.org/10.5281/zenodo.7304546
	- Demo: https://aurora-universities.eu/sdg-research/classify/


	### Direct Use

	<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

	This is a fine-tuned model and therefore requires no further training.


	## How to Get Started with the Model

	Use the code here to get started with the model: https://github.com/Aurora-Network-Global/sdgs_many_berts
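
As a complement to the linked repository, the following is a minimal sketch of how a checkpoint like this one is typically queried with the Hugging Face `transformers` library. It assumes the repository ships weights and a tokenizer that `transformers` can load directly, which this card does not confirm; the helper names `load_classifier` and `select_labels` and the `0.5` threshold are hypothetical and only illustrate the flow. The label names the model returns (for example `LABEL_0` versus an SDG name) depend on its configuration.

```python
from typing import Dict, List

MODEL_ID = "MauriceV2021/AuroraSDGsModel"  # repository named in this model card


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline.

    Assumption: the repository contains transformers-compatible weights.
    """
    # Imported lazily so select_labels() works without transformers installed.
    from transformers import pipeline

    # top_k=None asks the pipeline to return a score for every label.
    return pipeline("text-classification", model=model_id, top_k=None)


def select_labels(scores: List[Dict], threshold: float = 0.5) -> List[str]:
    """Keep the labels whose score meets the threshold, highest score first."""
    kept = [s for s in scores if s["score"] >= threshold]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return [s["label"] for s in kept]


# Example usage (downloads the model, so it is left commented out):
# classifier = load_classifier()
# scores = classifier(["Access to clean drinking water in rural areas"])[0]
# print(select_labels(scores))
```

Passing the text as a one-element list keeps the output shape predictable: one list of label/score dictionaries per input text. The threshold is illustrative and would need tuning against labeled data.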


	## Training Data

	<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
	The training data includes text from 1.4 titles and abstracts of academic research papers, labeled with SDG Goals and Targets, according to an initial validated query.

	See training data here: https://doi.org/10.5281/zenodo.5205672

	### Evaluation of the Training data

	- Avg_precision = 0.70
	- Avg_recall = 0.15

	The data was evaluated by 244 senior researchers with domain expertise.

	See evaluation report on the training data here: https://doi.org/10.5281/zenodo.4917107


	## Training Hyperparameters

	<!--
	- Num_epoch = 3
	- Learning rate = 5e-5
	- Batch size = 16
	-->

	## Evaluation

	### Metrics

	<!-- These are the evaluation metrics being used, ideally with a description of why. -->
	- Accuracy = 0.9
	- Matthews correlation = 0.89

	See evaluation report on the model here: https://doi.org/10.5281/zenodo.5603019
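
For readers unfamiliar with the two metrics, the sketch below shows how accuracy and the multi-class Matthews correlation coefficient are commonly computed from reference and predicted labels. It is a pure-Python illustration on toy data, assuming one label per example; the card does not state how its own figures were computed for the multi-label case, so this is not a reproduction of the reported numbers.

```python
from collections import Counter
from math import sqrt
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def matthews_corrcoef(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Multi-class Matthews correlation computed from label counts."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)  # how often each class truly occurs
    pred_counts = Counter(y_pred)  # how often each class is predicted
    classes = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in classes)
    cov_tt = n * n - sum(c * c for c in true_counts.values())
    cov_pp = n * n - sum(c * c for c in pred_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    # By convention, return 0.0 when either side has a single class only.
    return cov_tp / denom if denom else 0.0


# Toy data: three of four predictions are correct.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
print(accuracy(y_true, y_pred))           # 0.75
print(matthews_corrcoef(y_true, y_pred))  # approximately 0.577
```

Unlike accuracy, the Matthews correlation accounts for how predictions are distributed across classes, so it is less flattering when a classifier simply favors the most frequent class.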

	## Citation
	Sadick, A.M. (2023). SDG classification with BERT. https://huggingface.co/sadickam/sdg-classification-bert

	<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->


	<!--## Model Card Contact -->