|
--- |
|
license: apache-2.0 |
|
tags: |
|
- generated_from_trainer |
|
- financial |
|
- stocks |
|
- sentiment |
|
- sentiment-analysis |
|
- financial-news |
|
widget: |
|
- text: The company's quarterly earnings surpassed all estimates, indicating strong growth. |
|
datasets: |
|
- financial_phrasebank |
|
metrics: |
|
- accuracy |
|
model-index: |
|
- name: AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis |
|
results: |
|
- task: |
|
name: Text Classification |
|
type: text-classification |
|
dataset: |
|
name: financial_phrasebank |
|
type: financial_phrasebank |
|
args: sentences_allagree |
|
metrics: |
|
- name: Accuracy |
|
type: accuracy |
|
value: 0.96688 |
|
language: |
|
- en |
|
base_model: |
|
- distilbert/distilbert-base-uncased-finetuned-sst-2-english |
|
pipeline_tag: text-classification |
|
library_name: transformers |
|
--- |
|
# DistilBERT Fine-Tuned for Financial Sentiment Analysis |
|
## Model Description |
|
|
|
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) specifically tailored for sentiment analysis in the financial domain. It has been trained on the [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank) dataset to classify financial texts into three sentiment categories: |
|
|
|
- Negative (label `0`) |
|
- Neutral (label `1`) |
|
- Positive (label `2`) |
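
If you want to confirm the mapping at runtime rather than hard-coding it, you can inspect the model's configuration. This is a hedged sketch: depending on how the hosted config was saved, `id2label` may contain generic `LABEL_0`/`LABEL_1`/`LABEL_2` names, which is why the mapping is documented explicitly above.

```python
from transformers import AutoConfig

# Inspect the label mapping stored in the hosted config.
# The names may be generic LABEL_0/LABEL_1/LABEL_2 rather than
# "Negative"/"Neutral"/"Positive", depending on how the config was saved.
config = AutoConfig.from_pretrained(
    "AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis"
)
print(config.id2label)
print(config.label2id)
```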
|
|
|
## Model Performance |
|
The model was trained for 5 epochs and evaluated after each epoch on a held-out test set comprising 20% of the dataset.
|
|
|
### Evaluation Metrics |
|
| Epoch | Eval Loss | Eval Accuracy |
|-------|-----------|---------------|
| 1     | 0.2210    | 94.26%        |
| 2     | 0.1997    | 95.81%        |
| 3     | 0.1719    | 96.69%        |
| 4     | 0.2073    | 96.03%        |
| 5     | 0.1941    | **96.69%**    |
|
|
|
### Training Metrics |
|
- **Final Training Loss**: 0.0797 |
|
- **Total Training Time**: Approximately 3869 seconds (~1.07 hours) |
|
- **Training Samples per Second**: 2.34 |
|
- **Training Steps per Second**: 0.147 |
|
|
|
## Training Procedure |
|
### Data |
|
- **Dataset**: [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank) |
|
- **Configuration**: `sentences_allagree` (sentences where all annotators agreed on the sentiment) |
|
- **Dataset Size**: 2264 sentences |
|
- **Data Split**: 80% training (1811 samples), 20% testing (453 samples) |
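
For reference, the split described above can be reproduced with the `datasets` library roughly as follows. This is a sketch under the assumption that the standard `train_test_split` helper was used with the documented seed; the author's exact preprocessing script is not included here, and newer `datasets` versions may require the fully-qualified `takala/financial_phrasebank` identifier.

```python
from datasets import load_dataset

# Load the "all annotators agree" configuration of Financial PhraseBank
dataset = load_dataset("financial_phrasebank", "sentences_allagree")

# The dataset ships as a single "train" split; carve out 20% for testing
split = dataset["train"].train_test_split(test_size=0.2, seed=42)
train_ds, test_ds = split["train"], split["test"]

print(len(train_ds), len(test_ds))  # roughly 1811 / 453 sentences
```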
|
|
|
### Model Configuration |
|
- **Base Model**: [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) |
|
- **Number of Labels**: 3 (negative, neutral, positive) |
|
- **Tokenizer**: Same as the base model's tokenizer |
|
|
|
### Hyperparameters |
|
- **Number of Epochs**: 5 |
|
- **Batch Size**: 16 (training), 64 (evaluation) |
|
- **Learning Rate**: 5e-5 |
|
- **Optimizer**: AdamW |
|
- **Evaluation Metric**: Accuracy |
|
- **Seed**: 42 (for reproducibility) |
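
Putting the settings above together, the fine-tuning run can be approximated with the standard `Trainer` API roughly as shown below. This is an illustrative sketch, not the author's original training script; argument names such as `eval_strategy` (vs. `evaluation_strategy`) depend on your `transformers` version, and AdamW with a 5e-5 learning rate matches the `Trainer` defaults.

```python
import numpy as np
import evaluate
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base_model = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=3)

# 80/20 split of the sentences_allagree configuration, as described above
dataset = load_dataset("financial_phrasebank", "sentences_allagree")
split = dataset["train"].train_test_split(test_size=0.2, seed=42)

def tokenize(batch):
    # Dynamic padding is handled by the Trainer's default data collator
    return tokenizer(batch["sentence"], truncation=True)

tokenized = split.map(tokenize, batched=True)

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return accuracy.compute(predictions=np.argmax(logits, axis=-1), references=labels)

args = TrainingArguments(
    output_dir="distilbert-financial-sentiment",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    learning_rate=5e-5,
    eval_strategy="epoch",  # use evaluation_strategy on older transformers versions
    seed=42,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```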
|
|
|
## Usage |
|
You can load and use the model with the Hugging Face `transformers` library as follows: |
|
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "The company's revenue declined significantly due to market competition."
inputs = tokenizer(text, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Pick the class with the highest logit
logits = outputs.logits
predicted_class_id = logits.argmax().item()

label_mapping = {0: "Negative", 1: "Neutral", 2: "Positive"}
predicted_label = label_mapping[predicted_class_id]

print(f"Text: {text}")
print(f"Predicted Sentiment: {predicted_label}")
```
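
If you prefer the high-level `pipeline` API, an equivalent call looks like the following. Note that the returned label string comes from the model's `id2label` config and may appear as `LABEL_0`/`LABEL_1`/`LABEL_2` rather than the human-readable names, in which case you can map it with the same dictionary as above.

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis",
)

print(classifier("The company's quarterly earnings surpassed all estimates."))
# e.g. [{'label': 'LABEL_2', 'score': ...}]  -> LABEL_2 corresponds to "Positive"
```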
|
|
|
## License |
|
This model is licensed under the **Apache 2.0 License**. You are free to use, modify, and distribute this model in your applications. |
|
|
|
## Citation |
|
If you use this model in your research or applications, please cite it as: |
|
```
@misc{AnkitAI_2024_financial_sentiment_model,
  title={DistilBERT Fine-Tuned for Financial Sentiment Analysis},
  author={Ankit Aglawe},
  year={2024},
  howpublished={\url{https://huggingface.co/AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis}}
}
```
|
## Acknowledgments |
|
- **Hugging Face**: For providing the Transformers library and model hosting. |
|
- **Data Providers**: Thanks to the creators of the Financial PhraseBank dataset. |
|
- **Community**: Thanks to the open-source community for its continued support and contributions.
|
|
|
## Contact Information |
|
For questions, feedback, or collaboration opportunities, please contact: |
|
- **Name**: Ankit Aglawe |
|
- **Email**: [aglawe.ankit@gmail.com](mailto:aglawe.ankit@gmail.com)
|
- **GitHub**: [GitHub Profile](https://github.com/ankit-aglawe) |
|
- **LinkedIn**: [LinkedIn Profile](https://www.linkedin.com/in/ankit-aglawe) |
|
|