FinTwitBERT

FinTwitBERT is a language model pre-trained on a large dataset of financial tweets. This specialized BERT model aims to capture the jargon and communication style of financial Twitter, which makes it well suited to sentiment analysis, trend prediction, and other financial NLP tasks.

Sentiment Analysis

The FinTwitBERT-sentiment model builds on FinTwitBERT to classify the sentiment of financial tweets, giving a view of prevailing market sentiment.
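A minimal sketch of how the sentiment model might be called through the transformers `text-classification` pipeline. The helper names `load_sentiment_pipeline` and `classify_tweets` are hypothetical, and the label names the model returns are not stated here, so the code passes them through unchanged rather than assuming them.

```python
def load_sentiment_pipeline():
    """Build a text-classification pipeline for FinTwitBERT-sentiment.

    The import is local so the helper below can be used without
    transformers installed. Downloads the model on first use.
    """
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model="StephanAkkerman/FinTwitBERT-sentiment",
    )


def classify_tweets(classifier, tweets):
    """Pair each tweet with the label and score the classifier returns."""
    tweets = list(tweets)
    results = classifier(tweets)  # one {"label": ..., "score": ...} dict per tweet
    return [(tweet, r["label"], r["score"]) for tweet, r in zip(tweets, results)]
```

To try it, call `classify_tweets(load_sentiment_pipeline(), ["Nice 9% pre market move for $para, pump my calls Uncle Buffett"])` and print the returned tuples.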

Dataset

FinTwitBERT is pre-trained on several datasets of financial tweets that mention stocks and cryptocurrencies.

Model Details

FinTwitBERT is based on the FinBERT model and tokenizer, and adds two placeholder tokens (@USER and [URL]) to handle user mentions and links, which are common in tweets. The model underwent 10 epochs of pre-training, with early stopping to prevent overfitting.
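As a rough illustration of how the @USER and [URL] placeholders might be applied, the sketch below normalizes a raw tweet before it is passed to the tokenizer. The regular expressions and the `normalize_tweet` helper are assumptions made for this example; the preprocessing actually used in the FinTwitBERT repository may differ.

```python
import re

# Assumed patterns for illustration; not taken from the FinTwitBERT repository.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def normalize_tweet(text: str) -> str:
    """Replace links with [URL] and user mentions with @USER."""
    # Links are replaced first so an "@" inside a URL is not treated as a mention.
    text = URL_RE.sub("[URL]", text)
    text = MENTION_RE.sub("@USER", text)
    return text


print(normalize_tweet("@elonmusk says $TSLA will moon https://t.co/abc123"))
# prints: @USER says $TSLA will moon [URL]
```

Cashtags such as `$TSLA` are left untouched, since only mentions and links have dedicated placeholders.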

More Information

For a comprehensive overview, including the complete training setup details and more, visit the FinTwitBERT GitHub repository.

Usage

With Hugging Face's transformers library, the model and its tokenizer can be loaded into a pipeline for masked language modeling.

from transformers import pipeline

pipe = pipeline(
    "fill-mask",
    model="StephanAkkerman/FinTwitBERT",
)
print(pipe("Bitcoin is a [MASK] coin."))
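The fill-mask pipeline returns a list of dictionaries, each with `score`, `token`, `token_str`, and `sequence` keys. The sketch below shows one way to reduce that output to ranked (token, score) pairs. The helpers `summarize_predictions` and `run_demo` are hypothetical names introduced for this example.

```python
def summarize_predictions(predictions, min_score=0.0):
    """Return (token, score) pairs from fill-mask output, best first."""
    pairs = [
        (p["token_str"].strip(), round(p["score"], 4))
        for p in predictions
        if p["score"] >= min_score
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def run_demo():
    """Query the model for the top three fills. Downloads the model on first use."""
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="StephanAkkerman/FinTwitBERT")
    predictions = pipe("Bitcoin is a [MASK] coin.", top_k=3)
    for token, score in summarize_predictions(predictions):
        print(f"{token}\t{score}")
```

Calling `run_demo()` prints one candidate token per line with its probability. Raising `min_score` drops low-confidence candidates.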

Citing & Authors

If you use FinTwitBERT or FinTwitBERT-sentiment in your research, please cite us as follows, noting that both authors contributed equally to this work:

@misc{FinTwitBERT,
  author = {Stephan Akkerman and Tim Koornstra},
  title = {FinTwitBERT: A Specialized Language Model for Financial Tweets},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/TimKoornstra/FinTwitBERT}}
}

Additionally, if you utilize the sentiment classifier, please cite:

@misc{FinTwitBERT-sentiment,
  author = {Stephan Akkerman and Tim Koornstra},
  title = {FinTwitBERT-sentiment: A Sentiment Classifier for Financial Tweets},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/StephanAkkerman/FinTwitBERT-sentiment}}
}

License

This project is licensed under the MIT License. See the LICENSE file for details.

Model size: 110M params (Safetensors), tensor type F32.