---
language:
- en
license: apache-2.0
tags:
- "finance"
- "sentiment analysis"
- "regression"
- "sentence bert"
datasets:
- "RavenPack"
metrics:
- "rmse"
---

# FinEAS: Financial Embedding Analysis of Sentiment

SentenceBERT for Financial News Sentiment Regression

**DISCLAIMER:** This model has been successfully tested on a test set drawn from the same distribution as its training data. However, it is **not** a production-ready model, as it would probably need to be updated continuously. Furthermore, the model should ideally have been trained on more than two years of historical data. It would also need a supplementary assessment of bias, security and consistency.

## Introduction

Analyzing the sentiment of financial news is a complex task. It requires a deep understanding of financial jargon, knowledge of the context of each company, and awareness of how the whole economy interacts as an ecosystem.

The [FinBERT](https://huggingface.co/ProsusAI/finbert) model classifies sentiment as either positive or negative. However, binary classification is too simple and does not reflect reality.

RavenPack has an excellent, large, hand-labelled dataset with a continuous sentiment label that ranges from -1 to 1. We collected data from the two previous years for training and tested the model on data from the following two weeks. We also trained on one-year and six-month subsamples of the dataset to see how the model scales with data and whether more data helps.

The different models are available in this repository under different branch names. The main branch holds the model trained on the whole dataset.
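The RavenPack labels range from -1 to 1, whereas this card notes that the predictions of the Hub model range from 0 to 1, with 0.5 as neutral. The sketch below shows one way to convert between the two scales. It assumes they are related by a simple linear rescaling, which this card does not state explicitly. The function names `signed_to_unit` and `unit_to_signed` are hypothetical and are not part of the released code.

```python
def signed_to_unit(score: float) -> float:
    """Map a sentiment score from the [-1, 1] scale to the [0, 1] scale.

    Assumption: the two scales are related linearly, so that
    -1 -> 0 (negative), 0 -> 0.5 (neutral) and 1 -> 1 (positive).
    """
    return (score + 1.0) / 2.0


def unit_to_signed(prediction: float) -> float:
    """Map a prediction from the [0, 1] scale back to the [-1, 1] scale.

    This is the inverse of signed_to_unit under the same linear assumption.
    """
    return 2.0 * prediction - 1.0


if __name__ == "__main__":
    # Illustrative values only; these are not real model outputs.
    for prediction in (0.0, 0.5, 1.0):
        print(prediction, "->", unit_to_signed(prediction))
```

With this mapping, a prediction of 0.5 corresponds to a neutral score of 0 on the -1 to 1 scale, which makes predictions directly comparable with labels expressed in the original range.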
We also uploaded the FinBERT regressor to the Hub: https://huggingface.co/LHF/finbert-regressor

**Note that the predictions of this HF model range from 0 to 1, where 0 is negative, 0.5 is neutral and 1 is positive.**

## Evaluation

| Training period | FinEAS | FinBERT |
|-----------------|--------|---------|
| 6 months        | 0.0044 | 0.0050  |
| 12 months       | 0.0036 | 0.0034  |
| 24 months       | 0.0033 | 0.0040  |

*Evaluated on data from the following two weeks.

## Code

The code for this model is available at: https://github.com/lhf-labs/finance-news-analysis-bert

## Citation

```
@misc{gutierrezfandino2021fineas,
      title={FinEAS: Financial Embedding Analysis of Sentiment},
      author={Asier Gutiérrez-Fandiño and Miquel Noguer i Alonso and Petter Kolm and Jordi Armengol-Estapé},
      year={2021},
      eprint={2111.00526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```