distilbert-base-uncased-sentiment-reddit-crypto

This model is a fine-tuned version of distilbert-base-uncased, trained on a dataset of crypto-related Reddit comments. It achieves the following results on the evaluation set:

  • Loss: 0.3070
  • Accuracy: 0.8915

Accuracy on the final test set: 0.8641

Training and evaluation data

Training and validation data were collected from two sources:

  1. Kaggle Reddit cryptocurrency posts and comments
  2. Kaggle Reddit cryptocurrency-related posts from various subreddits. Comments were extracted from the subreddits 'cryptocurrency', 'bitcoin', 'ethereum', and 'dogecoin'.

The final test data come from https://www.surgehq.ai/datasets/crypto-sentiment-dataset.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
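
To make the linear scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step under a plain linear decay from 2e-05 to zero. It is an illustration, not the training code: the function name linear_lr is hypothetical, warmup_steps=0 is an assumption (the card lists no warmup setting), and total_steps=10218 is taken from the final step reported under Training results.

```python
def linear_lr(step, base_lr=2e-05, total_steps=10218, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay.

    warmup_steps=0 is an assumption; the model card does not state a warmup value.
    total_steps=10218 is the last step reported in the training results.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Start of training, end of epoch 1, end of epoch 2
    for step in (0, 5109, 10218):
        print(step, linear_lr(step))
```

Under these assumptions the rate is roughly halved by the end of the first epoch and reaches zero at the final step.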

Training results

Training Loss | Epoch | Step  | Validation Loss | Accuracy
0.2823        | 1.0   | 5109  | 0.2658          | 0.8840
0.1905        | 2.0   | 10218 | 0.3070          | 0.8915
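
The step counts also give a rough estimate of the training set size, which the card does not state. The sketch below bounds it from the 5109 steps per epoch and the batch size of 16. It assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these is confirmed by the card, and the function name is hypothetical.

```python
def train_set_size_bounds(steps_per_epoch, batch_size):
    """Return (min, max) number of training examples consistent with the step count.

    Assumes one optimizer step per batch on a single device and that the
    final, possibly partial, batch is not dropped (assumptions, not stated
    in the model card).
    """
    # Largest case: every batch is full
    max_examples = steps_per_epoch * batch_size
    # Smallest case: the final batch holds a single example
    min_examples = (steps_per_epoch - 1) * batch_size + 1
    return min_examples, max_examples


if __name__ == "__main__":
    low, high = train_set_size_bounds(steps_per_epoch=5109, batch_size=16)
    print(f"Estimated training examples: between {low} and {high}")
```

With the reported values this gives between 81729 and 81744 training examples, and the final step count of 10218 matches two epochs of 5109 steps each.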

Framework versions

  • Transformers 4.25.1
  • PyTorch 1.13.1+cu116
  • Datasets 2.8.0
  • Tokenizers 0.13.2