license: apache-2.0
tags:
  - generated_from_trainer
  - financial
  - stocks
  - sentiment
  - sentiment-analysis
  - financial-news
widget:
  - text: >-
      The company's quarterly earnings surpassed all estimates, indicating
      strong growth.
datasets:
  - financial_phrasebank
metrics:
  - accuracy
model-index:
  - name: AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: financial_phrasebank
          type: financial_phrasebank
          args: sentences_allagree
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.96688
language:
  - en
base_model:
  - distilbert/distilbert-base-uncased-finetuned-sst-2-english
pipeline_tag: text-classification
library_name: transformers

DistilBERT Fine-Tuned for Financial Sentiment Analysis

Model Description

This model is a fine-tuned version of distilbert-base-uncased tailored for sentiment analysis in the financial domain. It was trained on the Financial PhraseBank dataset to classify financial text into three sentiment categories:

Negative (label 0)
Neutral (label 1)
Positive (label 2)

Model Performance

The model was trained for 5 epochs and evaluated on a held-out test set comprising 20% of the dataset.

Evaluation Metrics

Epoch	Eval Loss	Eval Accuracy
1	0.2210	94.26%
2	0.1997	95.81%
3	0.1719	96.69%
4	0.2073	96.03%
5	0.1941	96.69%
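
As a reading aid for the table above, the short sketch below finds the epoch with the lowest evaluation loss and converts each accuracy into an approximate count of correctly classified sentences. It assumes accuracy was computed over all 453 test samples reported in the Data section; the variable names are illustrative only.

```python
# Per-epoch evaluation results copied from the table above:
# (epoch, eval loss, eval accuracy in percent)
eval_results = [
    (1, 0.2210, 94.26),
    (2, 0.1997, 95.81),
    (3, 0.1719, 96.69),
    (4, 0.2073, 96.03),
    (5, 0.1941, 96.69),
]

# Assumption: accuracy was measured on the full 453-sample test split.
TEST_SET_SIZE = 453

# Epoch with the lowest evaluation loss.
best_epoch, best_loss, best_acc = min(eval_results, key=lambda row: row[1])
print(f"Lowest eval loss: epoch {best_epoch} (loss {best_loss}, accuracy {best_acc}%)")

# Approximate number of correct predictions implied by each accuracy.
implied_correct = {
    epoch: round(acc / 100 * TEST_SET_SIZE) for epoch, _, acc in eval_results
}
for epoch, correct in implied_correct.items():
    print(f"Epoch {epoch}: about {correct} of {TEST_SET_SIZE} correct")
```

Epochs 3 and 5 share the highest accuracy, and epoch 3 also has the lowest evaluation loss.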

Training Metrics

Final Training Loss: 0.0797
Total Training Time: Approximately 3869 seconds (~1.07 hours)
Training Samples per Second: 2.34
Training Steps per Second: 0.147

Training Procedure

Data

Dataset: Financial PhraseBank
Configuration: sentences_allagree (sentences where all annotators agreed on the sentiment)
Dataset Size: 2264 sentences
Data Split: 80% training (1811 samples), 20% testing (453 samples)
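
The split sizes above follow from an 80/20 partition of 2264 sentences. Below is a minimal sketch of a seeded 80/20 split over sentence indices using only the Python standard library. It is not the author's splitting code, which this card does not show; the rounding rule (test size rounded up) and the reuse of seed 42 are assumptions chosen to reproduce the reported counts.

```python
import math
import random

DATASET_SIZE = 2264   # sentences in the sentences_allagree configuration
TEST_FRACTION = 0.2   # 20% held out for testing
SEED = 42             # assumption: same seed as listed under Hyperparameters

# Assumption: the test size is rounded up, which yields the reported 453.
n_test = math.ceil(DATASET_SIZE * TEST_FRACTION)
n_train = DATASET_SIZE - n_test

# Shuffle indices with a dedicated, seeded generator so the split is repeatable.
indices = list(range(DATASET_SIZE))
random.Random(SEED).shuffle(indices)

test_indices = indices[:n_test]
train_indices = indices[n_test:]

print(f"train: {len(train_indices)}, test: {len(test_indices)}")
```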

Model Configuration

Base Model: distilbert-base-uncased
Number of Labels: 3 (negative, neutral, positive)
Tokenizer: Same as the base model's tokenizer

Hyperparameters

Number of Epochs: 5
Batch Size: 16 (training), 64 (evaluation)
Learning Rate: 5e-5
Optimizer: AdamW
Evaluation Metric: Accuracy
Seed: 42 (for reproducibility)
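
The hyperparameters above can be cross-checked against the throughput figures in the Training Metrics section. The sketch below assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; the card states none of these explicitly, but under those assumptions the numbers line up.

```python
import math

TRAIN_SAMPLES = 1811     # training split size
EPOCHS = 5
TRAIN_BATCH_SIZE = 16
TRAIN_SECONDS = 3869     # reported total training time

# Assumption: one optimizer step per batch, last partial batch kept.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

samples_per_second = TRAIN_SAMPLES * EPOCHS / TRAIN_SECONDS
steps_per_second = total_steps / TRAIN_SECONDS

print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"samples/s: {samples_per_second:.2f}, steps/s: {steps_per_second:.3f}")
```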

Usage

You can load and use the model with the Hugging Face transformers library as follows:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
model_name = "AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize a single sentence and run inference without tracking gradients.
text = "The company's revenue declined significantly due to market competition."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# Pick the class with the highest logit along the label dimension.
logits = outputs.logits
predicted_class_id = logits.argmax(dim=-1).item()

# Map the class ID to its sentiment label.
label_mapping = {0: "Negative", 1: "Neutral", 2: "Positive"}
predicted_label = label_mapping[predicted_class_id]

print(f"Text: {text}")
print(f"Predicted Sentiment: {predicted_label}")
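
The example above returns only the top class. To also report a confidence per class, the logits can be passed through a softmax (in PyTorch, torch.softmax(logits, dim=-1)). The sketch below shows the same computation in plain Python; the logit values are hypothetical placeholders, not real model output.

```python
import math

label_mapping = {0: "Negative", 1: "Neutral", 2: "Positive"}

# Hypothetical logits for one sentence (illustrative only, not model output).
example_logits = [2.0, 0.5, -1.0]

def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probabilities = softmax(example_logits)
predicted_class_id = max(range(len(probabilities)), key=probabilities.__getitem__)

for class_id, prob in enumerate(probabilities):
    print(f"{label_mapping[class_id]}: {prob:.3f}")
print(f"Predicted Sentiment: {label_mapping[predicted_class_id]}")
```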

License

This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute this model in your applications.

Citation

If you use this model in your research or applications, please cite it as:

@misc{AnkitAI_2024_financial_sentiment_model,
  title={DistilBERT Fine-Tuned for Financial Sentiment Analysis},
  author={Ankit Aglawe},
  year={2024},
  howpublished={\url{https://huggingface.co/AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis}},
}

Acknowledgments

Hugging Face: For providing the Transformers library and model hosting.
Data Providers: Thanks to the creators of the Financial PhraseBank dataset.
Community: Appreciation to the open-source community for continual support and contributions.

Contact Information

For questions, feedback, or collaboration opportunities, please contact:

Name: Ankit Aglawe
Email: aglawe.ankit@gmail.com
GitHub: GitHub Profile
LinkedIn: LinkedIn Profile