---
license: apache-2.0
tags:
- generated_from_trainer
- financial
- stocks
- sentiment
- sentiment-analysis
- financial-news
widget:
- text: The company's quarterly earnings surpassed all estimates, indicating strong growth.
datasets:
- financial_phrasebank
metrics:
- accuracy
model-index:
- name: AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: financial_phrasebank
      type: financial_phrasebank
      args: sentences_allagree
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.96688
language:
- en
base_model:
- distilbert/distilbert-base-uncased-finetuned-sst-2-english
pipeline_tag: text-classification
library_name: transformers
---
# DistilBERT Fine-Tuned for Financial Sentiment Analysis
## Model Description
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) specifically tailored for sentiment analysis in the financial domain. It has been trained on the [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank) dataset to classify financial texts into three sentiment categories:
- Negative (label `0`)
- Neutral (label `1`)
- Positive (label `2`)
## Model Performance
The model was trained for 5 epochs and evaluated on a held-out test set comprising 20% of the dataset.
### Evaluation Metrics
| Epoch | Eval Loss | Eval Accuracy |
|-----------|---------------|-------------------|
| 1 | 0.2210 | 94.26% |
| 2 | 0.1997 | 95.81% |
| 3 | 0.1719 | 96.69% |
| 4 | 0.2073 | 96.03% |
| 5 | 0.1941 | **96.69%** |
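
As a quick reading aid for the table, the sketch below converts each accuracy into an implied number of correct predictions and compares epochs by loss and by accuracy. It assumes the 453-sentence test set stated in the Data section; the variable names are illustrative and not taken from the training script.

```python
# Per-epoch evaluation results copied from the table above.
EVAL_RESULTS = [
    # (epoch, eval_loss, eval_accuracy)
    (1, 0.2210, 0.9426),
    (2, 0.1997, 0.9581),
    (3, 0.1719, 0.9669),
    (4, 0.2073, 0.9603),
    (5, 0.1941, 0.9669),
]

# Assumption: the 20% test split of 453 sentences listed under "Data".
TEST_SET_SIZE = 453


def implied_correct(accuracy, n=TEST_SET_SIZE):
    """Number of correct predictions implied by an accuracy on n samples."""
    return round(accuracy * n)


# Epoch with the lowest evaluation loss.
lowest_loss_epoch = min(EVAL_RESULTS, key=lambda row: row[1])[0]

# Epoch(s) that reach the highest evaluation accuracy.
best_accuracy = max(row[2] for row in EVAL_RESULTS)
best_accuracy_epochs = [row[0] for row in EVAL_RESULTS if row[2] == best_accuracy]

correct_counts = {epoch: implied_correct(acc) for epoch, _, acc in EVAL_RESULTS}

print("Lowest eval loss at epoch:", lowest_loss_epoch)
print("Best accuracy at epoch(s):", best_accuracy_epochs)
print("Implied correct predictions:", correct_counts)
```

Epochs 3 and 5 tie on accuracy, while epoch 3 has the lower evaluation loss.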
### Training Metrics
- **Final Training Loss**: 0.0797
- **Total Training Time**: Approximately 3869 seconds (~1.07 hours)
- **Training Samples per Second**: 2.34
- **Training Steps per Second**: 0.147
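
The throughput figures can be cross-checked against the sample count, epoch count, and training batch size given in the Data and Hyperparameters sections. The sketch below is only an arithmetic check; it assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
import math

# Values taken from this model card.
TRAIN_SAMPLES = 1811      # 80% training split
EPOCHS = 5
TRAIN_BATCH_SIZE = 16
TRAIN_RUNTIME_S = 3869    # total training time in seconds

# Assumption: one optimizer step per batch (single device, no accumulation),
# with the last partial batch of each epoch counted as a step.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

samples_per_second = TRAIN_SAMPLES * EPOCHS / TRAIN_RUNTIME_S
steps_per_second = total_steps / TRAIN_RUNTIME_S
hours = TRAIN_RUNTIME_S / 3600

print(f"steps per epoch:    {steps_per_epoch}")
print(f"total steps:        {total_steps}")
print(f"samples per second: {samples_per_second:.2f}")
print(f"steps per second:   {steps_per_second:.3f}")
print(f"training time:      {hours:.2f} hours")
```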
## Training Procedure
### Data
- **Dataset**: [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank)
- **Configuration**: `sentences_allagree` (sentences where all annotators agreed on the sentiment)
- **Dataset Size**: 2264 sentences
- **Data Split**: 80% training (1811 samples), 20% testing (453 samples)
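
The split sizes follow from the dataset size. The sketch below reproduces the 1811/453 counts with a seeded shuffle over row indices. It is an illustration only: the actual splitting function used for training is not documented here, and reusing seed 42 and rounding the test share up are assumptions.

```python
import math
import random

DATASET_SIZE = 2264   # sentences in the `sentences_allagree` configuration
TEST_FRACTION = 0.20
SEED = 42             # assumption: same seed as listed under Hyperparameters


def split_indices(n, test_fraction, seed):
    """Shuffle row indices and return (train_indices, test_indices)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Assumption: the test share is rounded up, the rest goes to training.
    n_test = math.ceil(n * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(DATASET_SIZE, TEST_FRACTION, SEED)
print(f"train: {len(train_idx)} samples, test: {len(test_idx)} samples")
```

The resulting index lists could then be used to select rows from the loaded dataset.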
### Model Configuration
- **Base Model**: [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
- **Number of Labels**: 3 (negative, neutral, positive)
- **Tokenizer**: Same as the base model's tokenizer
### Hyperparameters
- **Number of Epochs**: 5
- **Batch Size**: 16 (training), 64 (evaluation)
- **Learning Rate**: 5e-5
- **Optimizer**: AdamW
- **Evaluation Metric**: Accuracy
- **Seed**: 42 (for reproducibility)
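
One way these hyperparameters might be expressed for the `transformers` `Trainer` is sketched below. This is not the original training script: the `output_dir` value is a hypothetical placeholder, and settings not listed above (evaluation schedule, weight decay, warmup) are left at library defaults. AdamW needs no explicit entry because the `Trainer` uses an AdamW optimizer by default.

```python
# Keyword arguments mirroring the hyperparameters listed above.
TRAINING_KWARGS = {
    "output_dir": "./results",            # hypothetical path, not from this card
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 64,
    "learning_rate": 5e-5,
    "seed": 42,
}


def build_training_arguments():
    """Create a TrainingArguments object from the settings above.

    The import is kept inside the function so the settings can be inspected
    without `transformers` installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)


if __name__ == "__main__":
    for name, value in TRAINING_KWARGS.items():
        print(f"{name}: {value}")
```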
## Usage
You can load and use the model with the Hugging Face `transformers` library as follows:
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis"

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)

text = "The company's revenue declined significantly due to market competition."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Gradients are not needed for inference.
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class for the single input sentence.
predicted_class_id = logits.argmax(dim=-1).item()
label_mapping = {0: "Negative", 1: "Neutral", 2: "Positive"}
predicted_label = label_mapping[predicted_class_id]

print(f"Text: {text}")
print(f"Predicted Sentiment: {predicted_label}")
```
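
If a confidence score is wanted alongside the predicted label, the logits can be passed through a softmax. The sketch below does this in plain Python so it runs without downloading the model; the example logits are made-up values, not real model output, and the helper names are illustrative.

```python
import math

LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores_from_logits(logits):
    """Map each sentiment label to its softmax probability."""
    return {LABELS[i]: p for i, p in enumerate(softmax(logits))}


# Made-up logits for illustration only.
example_logits = [2.0, 0.5, -1.0]
scores = scores_from_logits(example_logits)
top_label = max(scores, key=scores.get)

for label, prob in scores.items():
    print(f"{label}: {prob:.3f}")
print(f"Predicted Sentiment: {top_label}")
```

With the real model, `logits[0].tolist()` from the snippet above yields a list in this form.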
## License
This model is licensed under the **Apache 2.0 License**. You are free to use, modify, and distribute this model in your applications.
## Citation
If you use this model in your research or applications, please cite it as:
```bibtex
@misc{AnkitAI_2024_financial_sentiment_model,
  title={DistilBERT Fine-Tuned for Financial Sentiment Analysis},
  author={Ankit Aglawe},
  year={2024},
  howpublished={\url{https://huggingface.co/AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis}},
}
```
## Acknowledgments
- **Hugging Face**: For providing the Transformers library and model hosting.
- **Data Providers**: Thanks to the creators of the Financial PhraseBank dataset.
- **Community**: Appreciation to the open-source community for continual support and contributions.
## Contact Information
For questions, feedback, or collaboration opportunities, please contact:
- **Name**: Ankit Aglawe
- **Email**: [aglawe.ankit@gmail.com](mailto:aglawe.ankit@gmail.com)
- **GitHub**: [GitHub Profile](https://github.com/ankit-aglawe)
- **LinkedIn**: [LinkedIn Profile](https://www.linkedin.com/in/ankit-aglawe)