Migaku committed on
Commit 856bc8d • 1 Parent(s): 141c4fc

Update README

.ipynb_checkpoints/README-checkpoint.md CHANGED
@@ -7,7 +7,69 @@ tags:
  - Transformers
  ---

- Japanese Stock Comment Sentiment Model
+ ## Japanese Stock Comment Sentiment Model

  This model is a sentiment analysis tool specifically trained to analyze comments and discussions related to Japanese stocks. It is specialized in determining whether a comment has a bearish or bullish sentiment.
  For its training, a large collection of individual stock-related comments was gathered, and these were categorized into two main categories: "bullish" and "bearish." This model can serve as a supportive tool for stock investors and market analysts in gathering information and making prompt decisions.
+
+ ## How to use
+
+ ### Part 1: Model Initialization
+
+ In this section, we'll be initializing the necessary components required for our prediction: the model and the tokenizer.
+
+ ```python
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ # Load the model and tokenizer
+ model_path = "c299m/japanese_stock_sentiment"
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ model = AutoModelForSequenceClassification.from_pretrained(model_path).to(device)
+ ```
+
+ ### Part 2: Text Prediction
+
+ Once our model and tokenizer are initialized, we can move on to predicting the sentiment of a given text. The sentiment is classified into two categories: "bullish" (positive sentiment) or "bearish" (negative sentiment).
+
+ ```python
+ import numpy as np
+ import torch.nn.functional as F
+
+ # Text for inference
+ sample_text = "\
+ 材料良すぎてストップ安、、助けてクレステック、、、\
+ "
+
+ # Tokenize the text
+ inputs = tokenizer(sample_text, return_tensors="pt")
+
+ # Set the model to evaluation mode
+ model.eval()
+
+ # Execute the inference
+ with torch.no_grad():
+     outputs = model(
+         inputs["input_ids"].to(device),
+         attention_mask=inputs["attention_mask"].to(device),
+     )
+
+ # Obtain logits and apply softmax function to convert to probabilities
+ probabilities = F.softmax(outputs.logits, dim=1).cpu().numpy()
+
+ # Get the index of the class with the highest probability
+ y_preds = np.argmax(probabilities, axis=1)
+
+ # Convert the index to a label
+ def id2label(x):
+     return model.config.id2label[x]
+
+ y_dash = [id2label(x) for x in y_preds]
+
+ # Get the probability of the most likely class
+ top_probs = probabilities[np.arange(len(y_preds)), y_preds]
+
+ print(y_dash, top_probs)
+ ```
README.md CHANGED
@@ -7,7 +7,69 @@ tags:
  - Transformers
  ---

- Japanese Stock Comment Sentiment Model
+ ## Japanese Stock Comment Sentiment Model

  This model is a sentiment analysis tool specifically trained to analyze comments and discussions related to Japanese stocks. It is specialized in determining whether a comment has a bearish or bullish sentiment.
  For its training, a large collection of individual stock-related comments was gathered, and these were categorized into two main categories: "bullish" and "bearish." This model can serve as a supportive tool for stock investors and market analysts in gathering information and making prompt decisions.
+
+ ## How to use
+
+ ### Part 1: Model Initialization
+
+ In this section, we'll be initializing the necessary components required for our prediction: the model and the tokenizer.
+
+ ```python
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ # Load the model and tokenizer
+ model_path = "c299m/japanese_stock_sentiment"
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ model = AutoModelForSequenceClassification.from_pretrained(model_path).to(device)
+ ```
+
+ ### Part 2: Text Prediction
+
+ Once our model and tokenizer are initialized, we can move on to predicting the sentiment of a given text. The sentiment is classified into two categories: "bullish" (positive sentiment) or "bearish" (negative sentiment).
+
+ ```python
+ import numpy as np
+ import torch.nn.functional as F
+
+ # Text for inference
+ sample_text = "\
+ 材料良すぎてストップ安、、助けてクレステック、、、\
+ "
+
+ # Tokenize the text
+ inputs = tokenizer(sample_text, return_tensors="pt")
+
+ # Set the model to evaluation mode
+ model.eval()
+
+ # Execute the inference
+ with torch.no_grad():
+     outputs = model(
+         inputs["input_ids"].to(device),
+         attention_mask=inputs["attention_mask"].to(device),
+     )
+
+ # Obtain logits and apply softmax function to convert to probabilities
+ probabilities = F.softmax(outputs.logits, dim=1).cpu().numpy()
+
+ # Get the index of the class with the highest probability
+ y_preds = np.argmax(probabilities, axis=1)
+
+ # Convert the index to a label
+ def id2label(x):
+     return model.config.id2label[x]
+
+ y_dash = [id2label(x) for x in y_preds]
+
+ # Get the probability of the most likely class
+ top_probs = probabilities[np.arange(len(y_preds)), y_preds]
+
+ print(y_dash, top_probs)
+ ```
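
The post-processing step in the added README code (softmax over the logits, argmax, then a label lookup) can be tried without downloading the model. The following is a minimal standard-library sketch, not the author's code: the `ID2LABEL` mapping, the `softmax` and `decode` helpers, and the logit values are all hypothetical. The real mapping is stored in `model.config.id2label` and may use different names or ordering.

```python
import math

# Hypothetical label mapping for illustration only; the real one is read
# from model.config.id2label and may differ in names or ordering.
ID2LABEL = {0: "bearish", 1: "bullish"}


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label=ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits, not real model output.
label, prob = decode([2.0, -1.0])
print(label, round(prob, 3))
```

With real model output, each row of `outputs.logits` would take the place of the made-up list passed to `decode`.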