Jean-Baptiste committed on
Commit
761ba0d
1 Parent(s): 8592cc4

Create README.md

Files changed (1)
  1. README.md +74 -0
README.md ADDED
@@ -0,0 +1,74 @@
+ ---
+ language: en
+ datasets:
+ - Jean-Baptiste/financial_news_sentiment_mixte_with_phrasebank_75
+ widget:
+ - text: "LexaGene Receives Signed Quote from Large Biopharma Company to Purchase a MiQLab System -- LexaGene Holdings, Inc., (OTCQB: LXXGF; TSX-V: LXG) (“LexaGene” or the “Company”), an innovative, molecular diagnostics company that has commercialized the MiQLab® System for automated, genetic testing, is pleased to announce that it has received an indication that a major biopharma company intends to purchase its technology."
+ - text: "Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power"
+ - text: "Badger Infrastructure Solutions Ltd. Announces Resignation of Chief Financial Officer and Appointment of Interim Chief Financial Officer -- Badger Infrastructure Solutions Ltd. (“Badger” or the “Company”) (TSX:BDGI) announced today the resignation of Mr. Darren Yaworsky, Senior Vice President, Finance & Chief Financial Officer and the appointment of Mr. Pramod Bhatia as interim Chief Financial Officer. Mr. Yaworsky will remain with the Company until December 31, 2022 to facilitate an orderly transition."
+ license: mit
+ ---
+
+ # Model fine-tuned from roberta-large for topic classification of financial news (emphasis on Canadian news)
+
+ ### Introduction
+ This model was trained on the topic column of the financial_news_sentiment_mixte_with_phrasebank_75 dataset.
+ The topic column was generated by a zero-shot classification model over 11 topics.
+ The generated topics were not manually reviewed, so the dataset should be expected to contain misclassifications,
+ and the trained model will reproduce these biases.
+
+ ### Training data
+ Training data was classified as follows:
+
+ class |Description
+ -|-
+ 0 |acquisition
+ 1 |other
+ 2 |quaterly financial release
+ 3 |appointment to new position
+ 4 |dividend
+ 5 |corporate update
+ 6 |drillings results
+ 7 |conference
+ 8 |share repurchase program
+ 9 |grant of stocks
+
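When post-processing predictions, it can help to have the class table above available as a plain lookup. The sketch below copies the ids and descriptions from that table verbatim (including their spelling). The names `ID2LABEL`, `LABEL2ID` and `label_for` are illustrative assumptions and are not taken from the model repository, whose configuration may store the labels differently.

```python
# Hypothetical lookup built from the class table in this README.
# Label strings are copied as written there, spelling included.
ID2LABEL = {
    0: "acquisition",
    1: "other",
    2: "quaterly financial release",
    3: "appointment to new position",
    4: "dividend",
    5: "corporate update",
    6: "drillings results",
    7: "conference",
    8: "share repurchase program",
    9: "grant of stocks",
}

# Reverse mapping, useful when encoding labels as integer ids.
LABEL2ID = {label: class_id for class_id, label in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the description for a class id, raising KeyError if unknown."""
    return ID2LABEL[class_id]
```

For example, `label_for(4)` gives the description listed for class 4 in the table.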
+ ### How to use roberta-large-financial-news-topics-en with Hugging Face
+
+ ##### Load roberta-large-financial-news-topics-en and its sub-word tokenizer:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
+
+ # Load the tokenizer and the fine-tuned classification model from the Hub
+ tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-financial-news-topics-en")
+ model = AutoModelForSequenceClassification.from_pretrained("Jean-Baptiste/roberta-large-financial-news-topics-en")
+
+ # Process a text sample (a financial news release)
+ pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
+ pipe("Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power")
+
+ # Output:
+ # [{'label': 'quaterly financial release', 'score': 0.8829097151756287}]
+ ```
+
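The `score` in the pipeline output is a probability. For a single-label classifier such as this one, the text-classification pipeline is understood to apply a softmax over the model's logits by default and to report the highest-scoring class. The following is a minimal sketch of that step in plain Python, not the library's implementation: the helper names `softmax` and `top_prediction` are hypothetical, and the logits are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, labels):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}


# Made-up logits for three of the classes, for illustration only.
example = top_prediction(
    [0.2, 2.5, -1.0],
    ["acquisition", "quaterly financial release", "dividend"],
)
```

A low top score means the probability mass is spread over several topics, which can be a reason to treat the prediction with caution.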
+ ### Model performance
+
+ Overall precision, recall and F1 (macro average)
+
+ precision|recall|f1
+ -|-|-
+ 0.9355|0.9299|0.9325
+
+ By class
+
+ class|precision|recall|f1
+ -|-|-|-
+ negative|0.9605|0.9240|0.9419
+ neutral|0.9538|0.9459|0.9498
+ positive|0.8922|0.9200|0.9059
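
A macro average is the unweighted mean of the per-row scores, so the overall figures can be checked against the second table. The sketch below does that with the rounded values as printed; `per_class` and `macro_average` are illustrative names, and small differences in the last digit are expected because the inputs are already rounded.

```python
# Per-row (precision, recall, f1) values copied from the second table above.
per_class = {
    "negative": (0.9605, 0.9240, 0.9419),
    "neutral": (0.9538, 0.9459, 0.9498),
    "positive": (0.8922, 0.9200, 0.9059),
}


def macro_average(rows):
    """Unweighted mean of each metric column across all rows."""
    count = len(rows)
    column_sums = [sum(column) for column in zip(*rows.values())]
    return tuple(total / count for total in column_sums)


macro_precision, macro_recall, macro_f1 = macro_average(per_class)
print(f"{macro_precision:.4f} {macro_recall:.4f} {macro_f1:.4f}")
```

Because every row carries the same weight, a weaker score on one row lowers the macro average regardless of how many samples that row covers.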