Update data sources
README.md
CHANGED
@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # distilbert-base-uncased-sentiment-reddit-crypto
 
-This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the
+This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the crypto-related reddit comments dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.3070
 - Accuracy: 0.8915
@@ -29,7 +29,12 @@ More information needed
 
 ## Training and evaluation data
 
-
+Training and validation data collected from 2 sources:
+
+1. [Kaggle reddit cryptocurrency posts and comments](https://www.kaggle.com/datasets/gpreda/reddit-cryptocurrency)
+2. [Kaggle reddit cryptocurrency related posts from various subreddits](https://www.kaggle.com/datasets/leukipp/reddit-crypto-data). Comments from subreddits: `'cryptocurrency', 'bitcoin', 'ethereum', 'dogecoin'` were extracted.
+
+Final test data source is from https://www.surgehq.ai/datasets/crypto-sentiment-dataset.
 
 ## Training procedure
 
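
The README change says that, for the second Kaggle source, only comments from four subreddits were extracted. The commit does not show how that was done, so the following is only a minimal sketch of such a filter. The `subreddit` and `body` column names, the `filter_comments` helper, the case-insensitive matching, and the sample rows are all assumptions made for illustration and are not taken from the dataset or the author's code.

```python
import csv
import io

# Subreddit names as listed in the README change.
TARGET_SUBREDDITS = {"cryptocurrency", "bitcoin", "ethereum", "dogecoin"}


def filter_comments(csv_text, subreddit_col="subreddit", text_col="body"):
    """Return comment texts whose subreddit is in TARGET_SUBREDDITS.

    The column names are hypothetical; the real dataset schema may differ.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = []
    for row in reader:
        # Normalise the subreddit name so "Bitcoin" and "bitcoin" match (an assumption).
        name = (row.get(subreddit_col) or "").strip().lower()
        text = (row.get(text_col) or "").strip()
        # Keep only non-empty comments from the listed subreddits.
        if name in TARGET_SUBREDDITS and text:
            kept.append(text)
    return kept


# Tiny made-up sample, only to show the call shape.
SAMPLE = """subreddit,body
Bitcoin,BTC looks strong today
stocks,Index funds are boring
dogecoin,much wow
ethereum,
"""

filtered = filter_comments(SAMPLE)
print(filtered)
```

On the made-up sample, the `stocks` row is dropped because its subreddit is not in the list, and the `ethereum` row is dropped because its comment text is empty.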