---
license: apache-2.0
tags:
- generated_from_trainer
- financial
- stocks
- sentiment
- sentiment-analysis
- financial-news
widget:
- text: The company's quarterly earnings surpassed all estimates, indicating strong growth.
datasets:
- financial_phrasebank
metrics:
- accuracy
model-index:
- name: AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: financial_phrasebank
      type: financial_phrasebank
      args: sentences_allagree
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.96688
language:
- en
base_model:
- distilbert/distilbert-base-uncased-finetuned-sst-2-english
pipeline_tag: text-classification
library_name: transformers
---
# DistilBERT Fine-Tuned for Financial Sentiment Analysis
## Model Description
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) specifically tailored for sentiment analysis in the financial domain. It has been trained on the [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank) dataset to classify financial texts into three sentiment categories:
- Negative (label `0`)
- Neutral (label `1`)
- Positive (label `2`)
## Model Performance
The model was trained for 5 epochs and evaluated on a held-out test set constituting 20% of the dataset.
### Evaluation Metrics
| Epoch | Eval Loss | Eval Accuracy |
|-----------|---------------|-------------------|
| 1 | 0.2210 | 94.26% |
| 2 | 0.1997 | 95.81% |
| 3 | 0.1719 | 96.69% |
| 4 | 0.2073 | 96.03% |
| 5 | 0.1941 | **96.69%** |
### Training Metrics
- **Final Training Loss**: 0.0797
- **Total Training Time**: Approximately 3869 seconds (~1.07 hours)
- **Training Samples per Second**: 2.34
- **Training Steps per Second**: 0.147
## Training Procedure
### Data
- **Dataset**: [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank)
- **Configuration**: `sentences_allagree` (sentences where all annotators agreed on the sentiment)
- **Dataset Size**: 2264 sentences
- **Data Split**: 80% training (1811 samples), 20% testing (453 samples)
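The exact split code is not published; a minimal stdlib-only sketch of an 80/20 split with seed 42 (placeholder sentences stand in for the real dataset) reproduces the 1,811/453 sample counts:

```python
import random

# Hypothetical reconstruction of the 80/20 split described above.
# Placeholder strings stand in for the 2,264 sentences_allagree examples.
random.seed(42)
sentences = [f"sentence {i}" for i in range(2264)]
random.shuffle(sentences)

split = int(0.8 * len(sentences))  # 1811
train, test = sentences[:split], sentences[split:]
print(len(train), len(test))
```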
### Model Configuration
- **Base Model**: [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
- **Number of Labels**: 3 (negative, neutral, positive)
- **Tokenizer**: Same as the base model's tokenizer
### Hyperparameters
- **Number of Epochs**: 5
- **Batch Size**: 16 (training), 64 (evaluation)
- **Learning Rate**: 5e-5
- **Optimizer**: AdamW
- **Evaluation Metric**: Accuracy
- **Seed**: 42 (for reproducibility)
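These hyperparameters correspond roughly to the following `transformers` `TrainingArguments`; the actual training script is not published, so treat this as an illustrative sketch rather than the exact configuration used:

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters as TrainingArguments.
# output_dir is a placeholder; AdamW is the Trainer default optimizer.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    learning_rate=5e-5,
    seed=42,
)
```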
## Usage
You can load and use the model with the Hugging Face `transformers` library as follows:
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "The company's revenue declined significantly due to market competition."
inputs = tokenizer(text, return_tensors="pt")

# Inference only: disable gradient tracking.
with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
predicted_class_id = logits.argmax().item()

label_mapping = {0: "Negative", 1: "Neutral", 2: "Positive"}
predicted_label = label_mapping[predicted_class_id]

print(f"Text: {text}")
print(f"Predicted Sentiment: {predicted_label}")
```
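If you want class probabilities rather than just the argmax label, apply a softmax to the logits. A framework-free sketch (the logit values below are hypothetical, not actual model output):

```python
import math

def softmax(logits):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for (negative, neutral, positive).
logits = [-1.2, 0.3, 2.1]
probs = softmax(logits)

labels = ["Negative", "Neutral", "Positive"]
pred = labels[probs.index(max(probs))]
print(pred)  # Positive
```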
## License
This model is licensed under the **Apache 2.0 License**. You are free to use, modify, and distribute this model in your applications.
## Citation
If you use this model in your research or applications, please cite it as:
```
@misc{AnkitAI_2024_financial_sentiment_model,
title={DistilBERT Fine-Tuned for Financial Sentiment Analysis},
author={Ankit Aglawe},
year={2024},
howpublished={\url{https://huggingface.co/AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis}},
}
```
## Acknowledgments
- **Hugging Face**: For providing the Transformers library and model hosting.
- **Data Providers**: Thanks to the creators of the Financial PhraseBank dataset.
- **Community**: Appreciation to the open-source community for continual support and contributions.
## Contact Information
For questions, feedback, or collaboration opportunities, please contact:
- **Name**: Ankit Aglawe
- **Email**: aglawe.ankit@gmail.com
- **GitHub**: [GitHub Profile](https://github.com/ankit-aglawe)
- **LinkedIn**: [LinkedIn Profile](https://www.linkedin.com/in/ankit-aglawe)