Update README.md
README.md
CHANGED
@@ -2,8 +2,11 @@
 CryptoBERT is a pre-trained NLP model to analyse the language and sentiments of cryptocurrency-related social media posts and messages. It is built by further training the [cardiffnlp's Twitter-roBERTa-base](https://huggingface.co/cardiffnlp/twitter-roberta-base) language model on the cryptocurrency domain, using a corpus of over 3.2M unique cryptocurrency-related social media posts.
 
 
+## Classification Training
+CryptoBERT's sentiment classification head was fine-tuned on
+
 ## Training Corpus
-CryptoBERT was trained on 3.2M social media posts
+CryptoBERT was trained on 3.2M social media posts regarding various cryptocurrencies. Only non-duplicate posts of length above 4 words were considered. The following communities were used as sources for our corpora:
 
 
 (1) StockTwits - 1.875M posts about the top 100 cryptos by trading volume. Posts were collected from the 1st of November 2021 to the 16th of June 2022.
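The filtering rule stated in the added Training Corpus line (only non-duplicate posts of length above 4 words) could be sketched roughly as follows. This is an illustration only, not the authors' actual pipeline: the function name `filter_posts`, the whitespace-based word count, and the case-insensitive duplicate matching are all assumptions that the README does not specify.

```python
def filter_posts(posts, min_words=5):
    """Keep the first occurrence of each post that has more than 4 words.

    Assumptions (not stated in the README): words are split on whitespace,
    and two posts count as duplicates when they match case-insensitively
    after whitespace normalization.
    """
    seen = set()
    kept = []
    for post in posts:
        # Collapse runs of whitespace so spacing differences do not matter.
        text = " ".join(post.split())
        # "Length above 4 words" means at least 5 words.
        if len(text.split()) < min_words:
            continue
        key = text.casefold()
        if key in seen:
            continue  # duplicate of a post that was already kept
        seen.add(key)
        kept.append(text)
    return kept
```

For example, `filter_posts(["BTC to the moon", "BTC is going to the moon today"])` would drop the first post (4 words) and keep the second.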