Kevinmuhic1 yiyanghkust committed on
Commit efbbe4c, 0 parents

Duplicate from yiyanghkust/finbert-tone

Co-authored-by: Yi <yiyanghkust@users.noreply.huggingface.co>

Files changed (6)
  1. .gitattributes +17 -0
  2. README.md +42 -0
  3. config.json +26 -0
  4. pytorch_model.bin +3 -0
  5. tf_model.h5 +3 -0
  6. vocab.txt +0 -0
.gitattributes ADDED
@@ -0,0 +1,17 @@
+ *.bin.* filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tar.gz filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
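
The patterns above decide which of the committed files are stored through Git LFS instead of as regular blobs. A minimal sketch of that decision follows, assuming bare file names with no directory component. It uses Python's `fnmatch`, which only approximates Git's own attribute-matching rules, and the helper name `is_lfs_tracked` is made up for this illustration.

```python
from fnmatch import fnmatchcase

# Patterns copied from the .gitattributes diff.
LFS_PATTERNS = [
    "*.bin.*", "*.lfs.*", "*.bin", "*.h5", "*.tflite", "*.tar.gz",
    "*.ot", "*.onnx", "*.arrow", "*.ftz", "*.joblib", "*.model",
    "*.msgpack", "*.pb", "*.pt", "*.pth", "*tfevents*",
]


def is_lfs_tracked(filename: str) -> bool:
    """Approximate check: does a bare file name match any LFS pattern?

    Hypothetical helper; fnmatch is a stand-in for Git's real matcher.
    """
    return any(fnmatchcase(filename, pattern) for pattern in LFS_PATTERNS)


# The six files listed under "Files changed" in this commit.
for name in [".gitattributes", "README.md", "config.json",
             "pytorch_model.bin", "tf_model.h5", "vocab.txt"]:
    kind = "LFS pointer" if is_lfs_tracked(name) else "regular file"
    print(f"{name}: {kind}")
```

Under this approximation only `pytorch_model.bin` and `tf_model.h5` match, which is consistent with their diffs being three-line pointer files rather than binary content.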
README.md ADDED
@@ -0,0 +1,42 @@
+ ---
+ language: "en"
+ tags:
+ - financial-sentiment-analysis
+ - sentiment-analysis
+ widget:
+ - text: "growth is strong and we have plenty of liquidity"
+ ---
+
+ `FinBERT` is a BERT model pre-trained on financial communication text. The purpose is to enhance financial NLP research and practice. It is trained on the following three financial communication corpora, which total 4.9B tokens.
+ - Corporate Reports 10-K & 10-Q: 2.5B tokens
+ - Earnings Call Transcripts: 1.3B tokens
+ - Analyst Reports: 1.1B tokens
+
+ More technical details on `FinBERT`: [Click Link](https://github.com/yya518/FinBERT)
+
+ This released `finbert-tone` model is the `FinBERT` model fine-tuned on 10,000 manually annotated (positive, negative, neutral) sentences from analyst reports. This model achieves superior performance on the financial tone analysis task. If you are simply interested in using `FinBERT` for financial tone analysis, give it a try.
+
+ If you use the model in your academic work, please cite the following paper:
+
+ Huang, Allen H., Hui Wang, and Yi Yang. "FinBERT: A Large Language Model for Extracting Information from Financial Text." *Contemporary Accounting Research* (2022).
+
+
+ # How to use
+ You can use this model with the Transformers pipeline for sentiment analysis.
+ ```python
+ from transformers import BertTokenizer, BertForSequenceClassification
+ from transformers import pipeline
+
+ finbert = BertForSequenceClassification.from_pretrained('yiyanghkust/finbert-tone', num_labels=3)
+ tokenizer = BertTokenizer.from_pretrained('yiyanghkust/finbert-tone')
+
+ nlp = pipeline("sentiment-analysis", model=finbert, tokenizer=tokenizer)
+
+ sentences = ["there is a shortage of capital, and we need extra financing",
+              "growth is strong and we have plenty of liquidity",
+              "there are doubts about our finances",
+              "profits are flat"]
+ results = nlp(sentences)
+ print(results)  # LABEL_0: neutral; LABEL_1: positive; LABEL_2: negative
+
+ ```
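
The README snippet prints the raw pipeline output, a list of dicts with `label` and `score` keys. Whether a label reads `LABEL_1` or `Positive` depends on whether the loaded config carries an `id2label` mapping. The sketch below is one way to normalize both spellings and pair them with the input sentences. The helper names (`normalize_tone`, `attach_tones`) and the sample results are assumptions invented for illustration, not output from the model.

```python
# Mapping taken from the comment in the README snippet.
GENERIC_TO_TONE = {
    "LABEL_0": "neutral",
    "LABEL_1": "positive",
    "LABEL_2": "negative",
}


def normalize_tone(label: str) -> str:
    """Return a lowercase tone name for a generic or a named label."""
    return GENERIC_TO_TONE.get(label, label.lower())


def attach_tones(sentences, results):
    """Pair each sentence with its normalized tone and score."""
    return [
        {"sentence": s, "tone": normalize_tone(r["label"]), "score": r["score"]}
        for s, r in zip(sentences, results)
    ]


# Hypothetical pipeline output: labels and scores are made up, not real predictions.
example_sentences = ["first example sentence", "second example sentence"]
example_results = [
    {"label": "LABEL_2", "score": 0.9},
    {"label": "Positive", "score": 0.8},
]

for row in attach_tones(example_sentences, example_results):
    print(row)
```

With real output, `example_results` would be replaced by the `results` list from the snippet and `example_sentences` by `sentences`.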
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "architectures": [
+     "BertForSequenceClassification"
+   ],
+   "id2label": {
+     "0": "Neutral",
+     "1": "Positive",
+     "2": "Negative"
+   },
+   "label2id": {
+     "Positive": 1,
+     "Negative": 2,
+     "Neutral": 0
+   },
+   "attention_probs_dropout_prob": 0.1,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "max_position_embeddings": 512,
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "type_vocab_size": 2,
+   "vocab_size": 30873
+ }
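
The `id2label` and `label2id` entries are what turn a class index into a tone name. As a rough sketch of that step outside the Transformers library, the code below parses only the label fields of the config, applies a softmax to three logits, and looks up the winning class. The function names and the logits are assumptions chosen for illustration; the logits are not real model output.

```python
import json
import math

# Only the label fields of config.json are reproduced here.
CONFIG_FRAGMENT = """
{
  "id2label": {"0": "Neutral", "1": "Positive", "2": "Negative"},
  "label2id": {"Positive": 1, "Negative": 2, "Neutral": 0}
}
"""

config = json.loads(CONFIG_FRAGMENT)
# JSON object keys are always strings, so convert them to integer indices.
id2label = {int(index): label for index, label in config["id2label"].items()}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical logits, made up for illustration only.
label, prob = predict_label([0.2, 3.1, -1.0])
print(label, round(prob, 3))
```

The two mappings in the config are inverses of each other, so `label2id[id2label[i]] == i` holds for each of the three classes.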
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f31c2036e91c9854bcc35141d16669dd07b9726adfe391d1011bff1de7ea4b32
+ size 439101405
tf_model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:37d45bf7c11607c5df782ef613f711ccf8b5a9af5ae6ceffc2e5da8aae191096
+ size 439304476
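
Both weight files are committed as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. The sketch below parses such a pointer and checks a local file against it, which may be useful after downloading the weights. The function names are hypothetical, and the pointer text is the `pytorch_model.bin` entry from this commit.

```python
import hashlib

# Pointer contents of pytorch_model.bin as shown in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f31c2036e91c9854bcc35141d16669dd07b9726adfe391d1011bff1de7ea4b32
size 439101405
"""


def parse_lfs_pointer(text):
    """Split 'key value' lines into a dict with a typed size and split oid."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Hash a local file in chunks and compare size and digest to the pointer."""
    hasher = hashlib.new(pointer["algorithm"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(f"{pointer['algorithm']} digest, {pointer['size'] / 1e6:.1f} MB")
```

Calling `matches_pointer("pytorch_model.bin", pointer)` on a downloaded copy would return `True` only if both the byte count and the digest agree with the pointer.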
vocab.txt ADDED
The diff for this file is too large to render. See raw diff