	---
	language: en
	tags:
	- financial-sentiment-analysis
	- sentiment-analysis
	datasets:
	- financial_phrasebank
	widget:
	- text: Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in the corresponding
	    period in 2007 representing 7.7 % of net sales.
	- text: Bids or offers include at least 1,000 shares and the value of the shares must
	    correspond to at least EUR 4,000.
	- text: Raute reported a loss per share of EUR 0.86 for the first half of 2009 , against
	    EPS of EUR 0.74 in the corresponding period of 2008.
	model-index:
	- name: ahmedrachid/FinancialBERT-Sentiment-Analysis
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    dataset:
	      name: financial_phrasebank
	      type: financial_phrasebank
	      config: sentences_allagree
	      split: train
	    metrics:
	    - type: accuracy
	      value: 0.9889575971731449
	      name: Accuracy
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZWMyOTZhYTA3YjdjNDkwNWVjMGRlZGQxZDM1NTBmNGFkMWM0MzM2YTJiNzI4NzBjMzFiNTMwMzVkYTJmYmNlOCIsInZlcnNpb24iOjF9.9eOX4kC5HiagnTMpBp83H8ifgjzqwSa_tzLCjH8eMxRM6EKOhd9zWIYDtPWoKvNXpODjwRYLg38xKf09p6ZxCA
	    - type: f1
	      value: 0.9862110528444945
	      name: F1 Macro
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDBlNzhjZWU0YzIwMmIxMDkxNjk4NTkwNzA0N2RlODE5ZmNjMzVlYTBkZjJlYTlmODNiODcwMTNiZGRjYjE4NSIsInZlcnNpb24iOjF9.U_E-FCEFDIvzz7C1TWKRE0e9cSPlbV1VYy2SLAc1b-V3gonR1xUMosUwr99MTxsYSBaBAk9iyACXnefK_O45BQ
	    - type: f1
	      value: 0.9889575971731449
	      name: F1 Micro
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOGY0NTM2YThkY2VlOTZlOGZlZWMxMTU0NmIzNzNkNjIzMGI2NDM1Mjk2MzFiM2Y4MTQ5MWJmNzQxM2JmNjY1MiIsInZlcnNpb24iOjF9.6xsjHU05UtDn6vTo39MTu0Rle6CNf75dgoWqMOegs6WAW3QC6ndHhQPSGm1LriQ14IQ5J_JYK01yVXoRn1MjCg
	    - type: f1
	      value: 0.9889906387631547
	      name: F1 Weighted
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNGFmN2YwMjU1MDlkMTVjYjc5YWQ3MmQ2M2NlMWVjNWJlNDMxZjU4NTg4MjQ2NmFhZGE4OThhZjZiNjQ5N2E2OCIsInZlcnNpb24iOjF9.jvWFrjazySS_B9KZUexiATqObR826IP8eIT1O6eEZcu8GjiOCXcuNVlSfuqLFfysDWKpZXCbazSd9saUKloFCQ
	    - type: precision
	      value: 0.9854095875205817
	      name: Precision Macro
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTVhNTI0OTVhYmUxZjAxNzZkMmY4NDIwOTVlOTQ1MjA4OTZjZTNmMWZiOTg4NmFhNzY1NDViZmE3ZDFhYTZjMyIsInZlcnNpb24iOjF9.zKeviEdhTqP5Y1BmtVaBMW_3nhSd-gfXwxMVjwnaUsZNxURWUKJfCe7MACdetVtnX7Jz6ZUSybZYaZ3obUqMCw
	    - type: precision
	      value: 0.9889575971731449
	      name: Precision Micro
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZGQ3ZDViNTg0YzRlZTdlMjIyOWI5YTczNjZkZDJkNTZjNTQ5ZTc3YzY0NTI1YjMzMWQ5ODUxOWU2NzhmZjA4NSIsInZlcnNpb24iOjF9.Iaaol0A48I9ioGXYj8Tl0sWDQySxRlruUL3RiAR9NXureRbFQGuJBgF9Sd0WRrRe_0MFxkaOsXgkvBTh0u1IBg
	    - type: precision
	      value: 0.9891088373207723
	      name: Precision Weighted
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMDljNGIzYzdhOGQzYjRlNDE2ZjUwY2NjNzRhYmE1NjM4YjVkNTIwYTIyMmE4ODM5MTZkNWM1YzY2ZmRkMTc2YSIsInZlcnNpb24iOjF9.-ZULRBdW0VbSr6e64WDdKW3Ny5qT38O2lH669cQSbwp30PjPPUFO4oXhDWm4QIOjI0NfOiTjrbLTVQ7gR0vABg
	    - type: recall
	      value: 0.987120462774644
	      name: Recall Macro
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODVlYTE5NGMxMGIyM2UxN2ZhNmRiODM1ODhkZmNhNTNmMzVhNjg3M2FlYTM2NTI0MGQyN2ViM2YzODI0M2I0YyIsInZlcnNpb24iOjF9.yDZFOIzW041-s6dWxaap--K0-6Hp52hc_6rIi8_f3E-Q52WcJNLL0VHMBo0g2I3cT7UVRoIqPYoRxNgyHaZnAw
	    - type: recall
	      value: 0.9889575971731449
	      name: Recall Micro
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNmU4ZTg3MWEzNGZhMTY0MzQ1MjRmMTg1NTJmZjg0YWM3OGY4OGU5NWU0NmY0MmQ2YzZiNDYxMmFlNTNkZmUxYiIsInZlcnNpb24iOjF9.mvsikLjKldZ0SFThbAcygYEoJUNCQYE_bIbYyikMUHrSdY0BRlYsH5A32bu1BXAVMZVJVV9ebkSPmdKjZKIFAw
	    - type: recall
	      value: 0.9889575971731449
	      name: Recall Weighted
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMDhjNDY2Mjk0NzQyN2NjYTIxZmI5YTE1YTBkOTkzOGI3YTlmZDA1MzgxMTY4MmY3MmRkNjI4OTg4OWNmNTI0NCIsInZlcnNpb24iOjF9.zUaL-986kOJjv_VtlJAlvuEq0AxxlZaISlsmNFgvjifiFRpfPx5_-mKLkbsFjkS2q-_MQ8jTMMpQoiTVbaJMAA
	    - type: loss
	      value: 0.05342382565140724
	      name: loss
	      verified: true
	      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTZkNzJmYWM2MzExM2YzOTUzMzJkZmIyOGNhMjNkZTU3NWRlOWEyMWE5ZGY4MDU3Yjk2MTU4NTExMTg0M2I4ZCIsInZlcnNpb24iOjF9.cwtia03w0NY4FPTj9doI3S45t50HyhjNEttRg7tcr00vA5y_6xEak7OKMXkGQZ2noribvuRyf4218STYNTHlAQ
	---
	### FinancialBERT for Sentiment Analysis

	[FinancialBERT](https://huggingface.co/ahmedrachid/FinancialBERT) is a BERT model pre-trained on a large corpus of financial texts. Its purpose is to support financial NLP research and practice, so that practitioners and researchers in the financial domain can benefit from the model without the significant computational resources required to train it.

	The model was fine-tuned for the sentiment analysis task on the _Financial PhraseBank_ dataset. Experiments show that this model outperforms the general BERT and other financial domain-specific models.

	More details on `FinancialBERT`'s pre-training process can be found at: https://www.researchgate.net/publication/358284785_FinancialBERT_-_A_Pretrained_Language_Model_for_Financial_Text_Mining

	### Training data
	The FinancialBERT model was fine-tuned on [Financial PhraseBank](https://www.researchgate.net/publication/251231364_FinancialPhraseBank-v10), a dataset of 4840 financial news sentences categorised by sentiment (negative, neutral, positive).

	### Fine-tuning hyper-parameters
	- learning_rate = 2e-5
	- batch_size = 32
	- max_seq_length = 512
	- num_train_epochs = 5
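
As a rough sketch of how these settings translate into a training schedule, the snippet below collects them in a small config object and derives the number of optimizer steps. The `FineTuneConfig` class, the `total_steps` helper, and the training-set size of 4000 are illustrative assumptions; the train/validation split is not stated here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyper-parameters listed above (the container itself is hypothetical)."""
    learning_rate: float = 2e-5
    batch_size: int = 32
    max_seq_length: int = 512
    num_train_epochs: int = 5


def total_steps(n_train_examples: int, cfg: FineTuneConfig) -> int:
    """Optimizer steps, assuming one step per batch and a kept partial last batch."""
    steps_per_epoch = math.ceil(n_train_examples / cfg.batch_size)
    return steps_per_epoch * cfg.num_train_epochs


cfg = FineTuneConfig()
# 4000 is a made-up training-set size used only for illustration.
print(total_steps(4000, cfg))  # 125 steps per epoch * 5 epochs = 625
```

With gradient accumulation or a dropped last batch the step count would differ, so treat the figure as an estimate.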

	### Evaluation metrics
	The evaluation metrics are precision, recall, and F1-score. The following is the classification report on the test set.

	| sentiment | precision | recall | f1-score | support |
	| ------------- |:-------------:|:-------------:|:-------------:| -----:|
	| negative | 0.96 | 0.97 | 0.97 | 58 |
	| neutral | 0.98 | 0.99 | 0.98 | 279 |
	| positive | 0.98 | 0.97 | 0.97 | 148 |
	| macro avg | 0.97 | 0.98 | 0.98 | 485 |
	| weighted avg | 0.98 | 0.98 | 0.98 | 485 |
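
To make the two average rows concrete, here is a minimal sketch that recomputes them from the per-class rows. The `macro_avg` and `weighted_avg` helpers are illustrative names. Because the inputs are the two-decimal values from the table rather than the unrounded scores, a recomputed average can differ from the reported one in the last digit.

```python
# Per-class rows copied from the classification report above.
report = {
    "negative": {"precision": 0.96, "recall": 0.97, "f1": 0.97, "support": 58},
    "neutral": {"precision": 0.98, "recall": 0.99, "f1": 0.98, "support": 279},
    "positive": {"precision": 0.98, "recall": 0.97, "f1": 0.97, "support": 148},
}


def macro_avg(metric: str) -> float:
    """Unweighted mean over classes: every class counts equally."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(metric: str) -> float:
    """Mean over classes weighted by support (number of test sentences)."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


for metric in ("precision", "recall", "f1"):
    print(f"{metric:<9} macro={macro_avg(metric):.3f} weighted={weighted_avg(metric):.3f}")
```

The weighted average leans toward the neutral class, which accounts for 279 of the 485 test sentences, while the macro average treats the small negative class as equally important.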

	### How to use
	The model can be used with the Transformers pipeline for sentiment analysis.
	```python
	from transformers import BertTokenizer, BertForSequenceClassification
	from transformers import pipeline

	# Load the fine-tuned classifier (3 labels: negative, neutral, positive) and its tokenizer.
	model = BertForSequenceClassification.from_pretrained("ahmedrachid/FinancialBERT-Sentiment-Analysis", num_labels=3)
	tokenizer = BertTokenizer.from_pretrained("ahmedrachid/FinancialBERT-Sentiment-Analysis")

	nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

	sentences = [
	    "Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in the corresponding period in 2007 representing 7.7 % of net sales.",
	    "Bids or offers include at least 1,000 shares and the value of the shares must correspond to at least EUR 4,000.",
	    "Raute reported a loss per share of EUR 0.86 for the first half of 2009 , against EPS of EUR 0.74 in the corresponding period of 2008.",
	]
	results = nlp(sentences)
	print(results)

	# Output:
	# [{'label': 'positive', 'score': 0.9998133778572083},
	#  {'label': 'neutral', 'score': 0.9997822642326355},
	#  {'label': 'negative', 'score': 0.9877365231513977}]
	```
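
The pipeline returns one dictionary per sentence, so downstream code can work with plain Python structures. The sketch below takes the output shown above and flags predictions whose score falls under a cut-off. The `flag_low_confidence` helper and the 0.99 threshold are illustrative assumptions, not part of the model or the Transformers API.

```python
# Pipeline output copied from the example above.
results = [
    {"label": "positive", "score": 0.9998133778572083},
    {"label": "neutral", "score": 0.9997822642326355},
    {"label": "negative", "score": 0.9877365231513977},
]


def flag_low_confidence(predictions, threshold=0.99):
    """Return (label, score, needs_review) tuples.

    `threshold` is an arbitrary cut-off chosen for illustration only.
    """
    return [(p["label"], p["score"], p["score"] < threshold) for p in predictions]


for label, score, needs_review in flag_low_confidence(results):
    print(f"{label:<8} {score:.4f} review={needs_review}")
```

A suitable threshold depends on the application and would need to be calibrated on held-out data.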

	> Created by [Ahmed Rachid Hazourli](https://www.linkedin.com/in/ahmed-rachid/)