ElKulako committed on
Commit
c47aef6
1 Parent(s): 31b8743

Update README.md

Files changed (1)
  1. README.md +12 -5
README.md CHANGED
@@ -17,18 +17,25 @@ CryptoBERT was trained with a max sequence length of 128. Technically, it can ha
  # Classification Example
  ```python
  from transformers import TextClassificationPipeline, AutoModelForSequenceClassification, AutoTokenizer
- from datasets import load_dataset
- dataset_name = "ElKulako/stocktwits-crypto"
- dataset = load_dataset(dataset_name)
  model_name = "ElKulako/cryptobert"
- tokenizer_ = AutoTokenizer.from_pretrained(model_name, use_fast=True)
+ tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
  model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels = 3)
- pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, batch_size=64, max_length=64, truncation=True, padding = 'max_length')
+ pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, max_length=64, truncation=True, padding = 'max_length')
+ # post_1 & post_3 = bullish, post_2 = bearish
+ post_1 = " see y'all tomorrow and can't wait to see ada in the morning, i wonder what price it is going to be at. 😎🐂🤠💯😴, bitcoin is looking good go for it and flash by that 45k. "
+ post_2 = " alright racers, it’s a race to the bottom! good luck today and remember there are no losers (minus those who invested in currency nobody really uses) take your marks... are you ready? go!!"
+ post_3 = " i'm never selling. the whole market can bottom out. i'll continue to hold this dumpster fire until the day i die if i need to."
+ df_posts = [post_1, post_2, post_3]
  preds = pipe(df_posts)
+ print(preds)
 
 
  ```
 
+ ```
+ [{'label': 'Bullish', 'score': 0.8734585642814636}, {'label': 'Bearish', 'score': 0.9889495372772217}, {'label': 'Bullish', 'score': 0.6595883965492249}]
+ ```
+
  ## Training Corpus
  CryptoBERT was trained on 3.2M social media posts regarding various cryptocurrencies. Only non-duplicate posts of length above 4 words were considered. The following communities were used as sources for our corpora:
 
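
Note on the Training Corpus context line: the README says only non-duplicate posts longer than 4 words were kept, but it does not say how duplicates or words were determined. The following is a minimal sketch of such a filter, not the author's preprocessing. The function name `filter_posts`, exact-match deduplication after whitespace normalization, and whitespace-based word counting are all assumptions made for illustration.

```python
def filter_posts(posts, min_words=4):
    """Keep the first occurrence of each post that has more than `min_words` words.

    Assumptions (not stated in the README): words are whitespace-separated
    tokens, and two posts are duplicates if they match exactly once runs of
    whitespace are collapsed.
    """
    seen = set()
    kept = []
    for post in posts:
        # Collapse runs of whitespace and trim the ends before comparing.
        normalized = " ".join(post.split())
        # "Length above 4 words" means strictly more than min_words.
        if len(normalized.split()) <= min_words:
            continue
        # Skip posts already seen in normalized form.
        if normalized in seen:
            continue
        seen.add(normalized)
        kept.append(normalized)
    return kept


if __name__ == "__main__":
    posts = [
        "btc to the moon",                     # 4 words: dropped
        "ada is looking really good today",    # kept
        "ada is looking  really good today",   # duplicate after normalization
        "holding eth until the day i die",     # kept
    ]
    print(filter_posts(posts))
```

A real pipeline might also lowercase text or use near-duplicate detection; neither is claimed here.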