Model Card for Sustainable-Finance-BERT

Sustainable-Finance-BERT is a fine-tuned BERT model for classifying text documents into categories of sustainable finance and non-sustainable finance. It assigns labels to input text, indicating whether the content aligns with sustainable finance standards (label_0) or non-sustainable finance standards (label_1).

Model Details

Architecture: BERT (Bidirectional Encoder Representations from Transformers)
Training Approach: Fine-tuning on top of the pre-trained BERT model using a binary classification objective.
Pre-trained Model: The model was initialized with weights from a pre-trained BERT model: 'bert-base-uncased'.
Fine-tuning Data: The model was fine-tuned on a dataset of 14,000 text samples from sustainable finance standards and non-sustainable finance standards.
Fine-tuning Objective: Binary classification, with label_0 indicating sustainable finance and label_1 indicating non-sustainable finance.
Tokenization: Utilized BERT's tokenization scheme, which breaks down input text into subword tokens and converts them into numerical representations suitable for model input.
Optimizer: Adam optimizer with a learning rate of 2e-5.
Loss Function: Cross-entropy loss was employed as the optimization criterion during training.
Training Duration: Not reported. It depends on the hardware used and on the convergence criteria.
Hyperparameters: Batch size 16, learning rate 2e-5, and 4 training epochs, chosen during fine-tuning to optimize model performance.
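
The settings listed above can be sketched as a fine-tuning setup. This is a minimal sketch, not the author's training script: the names FineTuneConfig, total_steps, build_model_and_optimizer, and train_step are hypothetical, and the step count in the usage note assumes all 14,000 samples were used for training, since the card does not state a train/validation split.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Values taken from the Model Details list in this card."""
    base_model: str = "bert-base-uncased"
    num_labels: int = 2          # label_0 = sustainable, label_1 = non-sustainable
    batch_size: int = 16
    learning_rate: float = 2e-5
    epochs: int = 4


def total_steps(num_train_samples, cfg):
    """Number of optimizer steps over all epochs (last partial batch included)."""
    return math.ceil(num_train_samples / cfg.batch_size) * cfg.epochs


def build_model_and_optimizer(cfg):
    """Load the pre-trained encoder with a fresh 2-class head and an Adam optimizer."""
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import BertForSequenceClassification

    model = BertForSequenceClassification.from_pretrained(
        cfg.base_model, num_labels=cfg.num_labels
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=cfg.learning_rate)
    return model, optimizer


def train_step(model, optimizer, batch):
    """Run one optimization step.

    `batch` is a dict of tensors with keys input_ids, attention_mask, and labels.
    When integer class labels are passed, the model returns a cross-entropy loss.
    """
    model.train()
    optimizer.zero_grad()
    outputs = model(**batch)
    outputs.loss.backward()
    optimizer.step()
    return outputs.loss.item()
```

Under the stated assumption, total_steps(14000, FineTuneConfig()) gives 875 steps per epoch over 4 epochs, or 3,500 steps in total.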

Model Description

This model is capable of analyzing textual content and assigning labels indicating whether the material aligns with sustainable finance standards (label_0) or non-sustainable finance standards (label_1).

Developed by: Pelumioluwa Abiola
Model type: Fine-tuned BertForSequenceClassification for text classification
Language(s) (NLP): English (inherited from bert-base-uncased, an English-language model). Implemented in Python with Hugging Face's Transformers library.
Finetuned from model [optional]: bert-base-uncased, loaded as BertForSequenceClassification

This model automatically categorizes finance-related documents, helping financial institutions, researchers, policymakers, and other stakeholders identify content relevant to sustainable finance initiatives. It can support decision-making, risk assessment, and compliance monitoring in the finance sector.

Model Sources [optional]

For additional information and resources related to the model, please refer to the following links:

Repository: Sustainable_Finance_Analyzer GitHub Repository
Guidance: This model was guided by Chris McCormick's series on BERT, available here.

The resources above cover the development, usage, and fine-tuning of the Sustainable-Finance-BERT model. The GitHub repository also contains data cleaning and usage guidance for the model, which helps when integrating it into applications.

Uses

Sustainable-Finance-BERT is intended for automated classification of text documents as sustainable finance or non-sustainable finance. It can be used directly in several contexts:

Direct Use

Financial Institutions:

Risk Assessment: Financial institutions can use the model to assess the sustainability of their investment portfolios by classifying documents related to financial products, companies, or projects.
Compliance Monitoring: It aids in compliance monitoring with sustainable finance regulations and standards by automatically categorizing documents according to sustainability criteria.

Researchers:

Trend Analysis: Researchers can analyze trends and developments in sustainable finance by classifying large volumes of textual data, such as news articles, research papers, and policy documents.
Identifying Best Practices: The model helps identify best practices and emerging themes in sustainable finance initiatives by categorizing relevant literature and reports.

Policymakers:

Policy Evaluation: Policymakers can evaluate the effectiveness of sustainable finance policies and initiatives by categorizing documents discussing their implementation and impact.
Policy Formulation: It assists in formulating new policies and regulations related to sustainable finance by analyzing textual data on industry standards.

Environmental, Social, and Governance (ESG) Analysts:

ESG Integration: ESG analysts can integrate the model into their workflow to quickly screen companies and investment opportunities based on their alignment with sustainable finance principles.
Performance Evaluation: It facilitates the evaluation of companies' ESG performance by classifying sustainability reports, disclosures, and corporate communications.

Educational Institutions:

Curriculum Development: Educational institutions can use the model to develop curriculum materials on sustainable finance topics by categorizing relevant literature and case studies.
Student Projects: Students can utilize the model for research projects and assignments focusing on sustainable finance trends, policies, and practices.

Foreseeable Users

Financial Analysts: Professionals involved in financial analysis, investment management, and risk assessment.
Sustainability Specialists: Individuals working in sustainability consulting, corporate sustainability, and environmental advocacy.
Policy Analysts: Experts involved in policy research, advocacy, and government advisory roles.
Data Scientists and Machine Learning Engineers: Professionals working in the development and deployment of natural language processing (NLP) models.
Academic Researchers: Scholars conducting research in finance, economics, sustainability, and related fields.

Sustainable-Finance-BERT applies across these sectors, supporting informed decision-making in sustainable finance.

Downstream Use [optional]

Sustainable-Finance-BERT can be further fine-tuned for specific tasks or integrated into larger systems and applications. Potential downstream uses include:

Fine-tune the model to align with specific regulatory frameworks and sustainability standards relevant to different jurisdictions or industry sectors.
Analyze trends and patterns in sustainable finance discourse by applying the model to large-scale textual datasets, identifying emerging topics, key influencers, and evolving narratives.
Fine-tune the model further based on specific criteria or preferences of investors, allowing for personalized recommendations and portfolio customization.
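
For the trend-analysis use above, one simple approach is to aggregate predicted labels by time period. The following is a minimal sketch, assuming predictions have already been produced and paired with a period such as a publication year. The function name and record layout are hypothetical, and the records in the usage note are invented for illustration only.

```python
from collections import defaultdict

# Per this card, label_0 marks sustainable finance.
SUSTAINABLE = "LABEL_0"


def sustainable_share_by_period(records):
    """Return {period: fraction of documents labelled sustainable finance}.

    `records` is an iterable of (period, label) pairs, e.g. ("2021", "LABEL_0").
    Label case is ignored, so "label_0" and "LABEL_0" are treated alike.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for period, label in records:
        totals[period] += 1
        if label.upper() == SUSTAINABLE:
            hits[period] += 1
    return {period: hits[period] / totals[period] for period in sorted(totals)}
```

For example, sustainable_share_by_period([("2020", "LABEL_0"), ("2020", "LABEL_1"), ("2021", "LABEL_0")]) reports a share of 0.5 for 2020 and 1.0 for 2021.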

Out-of-Scope Use

While the model is built to classify text documents as sustainable finance or non-sustainable finance, some uses fall outside its scope or may yield poor results:

Sentiment Analysis: The model is not specifically designed for sentiment analysis tasks and may not accurately capture sentiment nuances in text related to sustainable finance.
Topic Modeling: While the model can identify documents relevant to sustainable finance, it may not be suitable for topic modeling tasks requiring finer granularity in identifying specific themes or topics within the domain.
Legal Compliance: The model should not be solely relied upon for legal compliance purposes, as it may not capture all regulatory nuances or legal requirements relevant to sustainable finance.
Highly Specialized Domains: Use of the model in highly specialized domains outside the scope of sustainable finance may yield suboptimal results, as it is specifically trained on data from this domain.

It's important to consider the model's limitations and ensure that its use aligns with its intended scope and capabilities to achieve the best outcomes.

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
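
One way to run the model is through the Transformers text-classification pipeline. The following is a minimal sketch, assuming the model is published on the Hugging Face Hub. The card does not state the repository ID, so model_id is left for the caller to supply. The sketch also assumes the checkpoint emits the default LABEL_0 / LABEL_1 output names, and the helper names load_classifier and classify are hypothetical.

```python
# Meaning of each label, as stated in this card.
LABEL_MEANINGS = {
    "LABEL_0": "sustainable finance",
    "LABEL_1": "non-sustainable finance",
}


def load_classifier(model_id):
    """Build a text-classification pipeline for the given Hub repository ID."""
    # Imported lazily so classify() can be used and tested without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(texts, classifier):
    """Run `classifier` on `texts` and return (meaning, score) pairs.

    `classifier` is any callable with the pipeline's calling convention:
    it takes a list of strings and returns a list of
    {"label": ..., "score": ...} dicts.
    """
    # truncation=True cuts long inputs to the model's maximum sequence length.
    results = classifier(list(texts), truncation=True)
    return [
        (LABEL_MEANINGS.get(r["label"].upper(), r["label"]), r["score"])
        for r in results
    ]
```

Typical usage would be classifier = load_classifier(model_id) followed by classify(["some passage of text"], classifier), where model_id is the Hub repository that hosts this model.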

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]
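
Once the fields above are known, they map onto a simple estimate: energy used multiplied by the carbon intensity of the local grid. The following is a rough sketch under that assumption, not the calculator's exact implementation. The function name is hypothetical, and the numbers in the usage note are placeholders, not measurements for this model.

```python
def estimate_co2_kg(hours, power_watts, grid_kg_co2_per_kwh):
    """Estimate emissions in kg CO2-equivalent.

    hours:               training time (the "Hours used" field)
    power_watts:         average power draw of the hardware (from "Hardware Type")
    grid_kg_co2_per_kwh: carbon intensity of the grid (from provider and region)
    """
    if hours < 0 or power_watts < 0 or grid_kg_co2_per_kwh < 0:
        raise ValueError("inputs must be non-negative")
    energy_kwh = power_watts * hours / 1000.0  # watt-hours to kilowatt-hours
    return energy_kwh * grid_kg_co2_per_kwh
```

With placeholder inputs of 10 hours at 300 W on a grid emitting 0.4 kg CO2 per kWh, the estimate is 3 kWh of energy and roughly 1.2 kg of CO2-equivalent.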

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]