File size: 2,851 Bytes
14c6561
 
ffe64b8
 
 
 
14c6561
 
 
 
 
30e27d7
 
14c6561
 
e61d990
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
---
language: en
tags:
  - financial
  - stocks
  - sentiment
datasets:
- Jean-Baptiste/financial_news_sentiment_mixte_with_phrasebank_75
widget:
- text: "Fortuna Silver Mines Inc. (NYSE: FSM) (TSX: FVI) reports solid production results for the third quarter of 2022 from its four operating mines in the Americas and West Africa."
- text: "Theratechnologies Reports Financial Results for the Third Quarter of Fiscal 2022 and Provides Business Update -- - Q3 2022 Consolidated Revenue Growth of 17% to $20.8 million"
- text: "The Flowr Corporation To File for CCAA Protection"
- text: "Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power"

license: mit
---

# Model fine-tuned from roberta-large for sentiment classification of financial news (emphasis on Canadian news).

### Introduction
This model was train on financial_news_sentiment_mixte_with_phrasebank_75 dataset.
This is a customized version of the phrasebank dataset in which I kept only sentence validated by at least 75% annotators. 
In addition I added ~2000 articles validated manually on Canadian financial news. Therefore the model is more specifically trained for Canadian news.
Final result is f1 score of 93.25% overall and 83.6% on Canadian news.




### Training data
Training data was classified as follow:

class |Description
-|-
0 |negative
1 |neutral
2 |positive


### How to use roberta-large-financial-news-sentiment-en with HuggingFace

##### Load roberta-large-financial-news-sentiment-en and its sub-word tokenizer :

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")
model = AutoModelForSequenceClassification.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")


##### Process text sample (from wikipedia)

from transformers import pipeline

pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
pipe("Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power")

[{'label': 'negative', 'score': 0.9399105906486511}]

```

### Model performances

Overall f1 score (average macro)

precision|recall|f1
-|-|-
0.9355|0.9299|0.9325

By entity

entity|precision|recall|f1
-|-|-|-
negative|0.9605|0.9240|0.9419 
neutral|0.9538|0.9459|0.9498
positive|0.8922|0.9200|0.9059