language: en
tags:
  - financial-sentiment-analysis
  - sentiment-analysis
datasets:
  - financial_phrasebank
widget:
  - text: >-
      Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in the corresponding
      period in 2007 representing 7.7 % of net sales.
  - text: >-
      Bids or offers include at least 1,000 shares and the value of the shares
      must correspond to at least EUR 4,000.
  - text: >-
      Raute reported a loss per share of EUR 0.86 for the first half of 2009 ,
      against EPS of EUR 0.74 in the corresponding period of 2008.

FinancialBERT for Sentiment Analysis

FinancialBERT is a BERT model pre-trained on a large corpus of financial texts. Its purpose is to enhance financial NLP research and practice in the financial domain. We hope financial practitioners and researchers can benefit from our model without needing the significant computational resources required to train it.

Our model was fine-tuned for the sentiment analysis task on the Financial PhraseBank dataset. Experiments show that it outperforms the general BERT model and other financial domain-specific models.

Training data

The FinancialBERT model was fine-tuned on Financial PhraseBank, a dataset of 4,840 financial news sentences categorised by sentiment (negative, neutral, positive).

Fine-tuning hyper-parameters

learning_rate = 2e-5
batch_size = 32
max_seq_length = 512
num_train_epochs = 5
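
As a rough illustration, the values above could be mapped onto keyword arguments for the `transformers.TrainingArguments` class. The README does not say which training script was used, so this mapping is an assumption, as is single-device training without gradient accumulation. The `train_size` value is hypothetical and only shows how the number of optimizer steps follows from the batch size and the epoch count.

```python
import math

# Hyper-parameters as listed in this README.
HPARAMS = {
    "learning_rate": 2e-5,
    "batch_size": 32,
    "max_seq_length": 512,
    "num_train_epochs": 5,
}


def to_training_kwargs(hparams):
    """Map the README names onto transformers.TrainingArguments keyword names.

    Assumption: single-device training, so batch_size is the per-device size.
    max_seq_length is not a TrainingArguments field; it would be applied when
    tokenizing, e.g. tokenizer(text, truncation=True, max_length=512).
    """
    return {
        "learning_rate": hparams["learning_rate"],
        "per_device_train_batch_size": hparams["batch_size"],
        "num_train_epochs": hparams["num_train_epochs"],
    }


def total_optimizer_steps(train_size, hparams):
    """ceil(train_size / batch_size) * epochs, assuming no gradient accumulation."""
    steps_per_epoch = math.ceil(train_size / hparams["batch_size"])
    return steps_per_epoch * hparams["num_train_epochs"]


training_kwargs = to_training_kwargs(HPARAMS)

# Hypothetical training-set size, chosen only for illustration.
example_steps = total_optimizer_steps(train_size=1000, hparams=HPARAMS)
```

The resulting dictionary could then be passed as `TrainingArguments(output_dir=..., **training_kwargs)`, where the output directory is left to the reader.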

Metrics

The evaluation metrics are precision, recall, and F1-score. The following is the classification report on the test set.

relation	precision	recall	f1-score	support
has	0.7416	0.9674	0.8396	2362
is in	0.7813	0.7925	0.7869	2362
is	0.8650	0.6863	0.7653	2362
are	0.8365	0.8493	0.8429	2362
x	0.9515	0.8302	0.8867	2362

macro avg	0.8352	0.8251	0.8243	11810
weighted avg	0.8352	0.8251	0.8243	11810
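
For readers who want to see how such a report is derived, here is a minimal pure-Python sketch of per-class precision, recall, and F1, together with the macro average (the unweighted mean over classes). The function name `classification_scores` and the toy labels are made up for illustration; they are not the model's predictions and do not reproduce the figures above.

```python
def classification_scores(y_true, y_pred, labels):
    """Return per-class precision, recall, F1 and support, plus the macro F1."""
    report = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,  # number of true examples of this class
        }
    macro_f1 = sum(r["f1"] for r in report.values()) / len(labels)
    return report, macro_f1


# Toy data for illustration only.
y_true = ["positive", "positive", "neutral", "neutral", "negative", "negative"]
y_pred = ["positive", "neutral", "neutral", "neutral", "negative", "positive"]

report, macro_f1 = classification_scores(
    y_true, y_pred, labels=["negative", "neutral", "positive"]
)
```

A weighted average would instead weight each class score by its support. When every class has the same support, the macro and weighted averages coincide.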

How to use

The model can be used with the Transformers pipeline for sentiment analysis.

>>> from transformers import BertTokenizer, BertForSequenceClassification
>>> from transformers import pipeline

>>> model = BertForSequenceClassification.from_pretrained("ahmedrachid/FinancialBERT-Sentiment-Analysis", num_labels=3)
>>> tokenizer = BertTokenizer.from_pretrained("ahmedrachid/FinancialBERT-Sentiment-Analysis")

>>> nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

>>> sentences = ["Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in the corresponding period in 2007 representing 7.7 % of net sales.",
...              "Bids or offers include at least 1,000 shares and the value of the shares must correspond to at least EUR 4,000.",
...              "Raute reported a loss per share of EUR 0.86 for the first half of 2009 , against EPS of EUR 0.74 in the corresponding period of 2008.",
...              ]
>>> results = nlp(sentences)
>>> print(results)

[{'label': 'positive', 'score': 0.9998133778572083},
 {'label': 'neutral', 'score': 0.9997822642326355},
 {'label': 'negative', 'score': 0.9877365231513977}]
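
The pipeline returns one dictionary per input sentence, in input order. As a minimal sketch of how that output might be consumed, the snippet below pairs each prediction with its sentence and keeps only the confident ones. The helper name `pair_and_filter` and the `min_score` threshold of 0.99 are illustrative assumptions, not part of the model or the library. The `results` list is copied from the output above so that the example runs without downloading the model.

```python
sentences = [
    "Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in the corresponding period in 2007 representing 7.7 % of net sales.",
    "Bids or offers include at least 1,000 shares and the value of the shares must correspond to at least EUR 4,000.",
    "Raute reported a loss per share of EUR 0.86 for the first half of 2009 , against EPS of EUR 0.74 in the corresponding period of 2008.",
]

# Copied from the pipeline output shown above.
results = [
    {"label": "positive", "score": 0.9998133778572083},
    {"label": "neutral", "score": 0.9997822642326355},
    {"label": "negative", "score": 0.9877365231513977},
]


def pair_and_filter(sentences, results, min_score=0.99):
    """Attach each prediction to its sentence; keep those scoring >= min_score.

    min_score is a hypothetical cut-off chosen for illustration.
    """
    paired = [
        {"text": text, "label": r["label"], "score": r["score"]}
        for text, r in zip(sentences, results)
    ]
    return [p for p in paired if p["score"] >= min_score]


confident = pair_and_filter(sentences, results)
confident_labels = [p["label"] for p in confident]
```

With this threshold the third sentence is dropped because its score is below 0.99. A lower `min_score` keeps all three.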