FinBERT is a BERT model pre-trained on financial communication text. The purpose is to enhance financial NLP research and practice. It is trained on the following three financial communication corpus. The total corpora size is 4.9B tokens.
- Corporate Reports 10-K & 10-Q: 2.5B tokens
- Earnings Call Transcripts: 1.3B tokens
- Analyst Reports: 1.1B tokens
More technical details on
FinBERT: Click Link
Please check out our working paper
FinBERT—A Deep Learning Approach to Extracting Textual Information.
finbert-tone model is the
FinBERT model fine-tuned on 10,000 manually annotated (positive, negative, neutral) sentences from analyst reports. This model achieves superior performance on financial tone analysis task. If you are simply interested in using
FinBERT for financial tone analysis, give it a try.
You can use this model with Transformers pipeline for sentiment analysis.
from transformers import BertTokenizer, BertForSequenceClassification from transformers import pipeline finbert = BertForSequenceClassification.from_pretrained('yiyanghkust/finbert-tone',num_labels=3) tokenizer = BertTokenizer.from_pretrained('yiyanghkust/finbert-tone') nlp = pipeline("sentiment-analysis", model=finbert, tokenizer=tokenizer) sentences = ["there is a shortage of capital, and we need extra financing", "growth is strong and we have plenty of liquidity", "there are doubts about our finances", "profits are flat"] results = nlp(sentences) print(results) #LABEL_0: neutral; LABEL_1: positive; LABEL_2: negative
- Downloads last month