---
license: mit
language:
- ja
tags:
- PyTorch
- Transformers
---

## Japanese Stock Comment Sentiment Model

This model classifies the sentiment of comments and discussions about Japanese stocks as either bullish or bearish.
It was trained on a large collection of comments about individual stocks, each sorted into one of two categories: "bullish" or "bearish." It can help stock investors and market analysts gather information and make decisions quickly.

## How to use

### Part 1: Model Initialization

This section loads the two components needed for prediction: the model and the tokenizer.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the model and tokenizer
model_path = "c299m/japanese_stock_sentiment"

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path).to(device)
```

### Part 2: Text Prediction

With the model and tokenizer loaded, we can predict the sentiment of a given text. The model assigns one of two classes: "bullish" (positive sentiment) or "bearish" (negative sentiment).

```python
import numpy as np
import torch.nn.functional as F

# Text for inference
# (roughly: "The news was so good it went limit-down... help me, Crestec...")
sample_text = "材料良すぎてストップ安、、助けてクレステック、、、"

# Tokenize the text
inputs = tokenizer(sample_text, return_tensors="pt", truncation=True)

# Set the model to evaluation mode
model.eval()

# Execute the inference
with torch.no_grad():
    outputs = model(
        inputs["input_ids"].to(device),
        attention_mask=inputs["attention_mask"].to(device),
    )

# Obtain logits and apply softmax function to convert to probabilities
probabilities = F.softmax(outputs.logits, dim=1).cpu().numpy()

# Get the index of the class with the highest probability
y_preds = np.argmax(probabilities, axis=1)

# Convert the index to a label
def id2label(x):
    return model.config.id2label[x]

y_dash = [id2label(x) for x in y_preds]

# Get the probability of the most likely class
top_probs = probabilities[np.arange(len(y_preds)), y_preds]

print(y_dash, top_probs)
```
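
The post-processing steps above (softmax, argmax, label lookup) can be tried without loading the model. The following is a minimal sketch in plain Python. The `ID2LABEL` mapping and the logits are made up for illustration and are assumptions; the real mapping lives in `model.config.id2label` and may order the classes differently.

```python
import math

# Hypothetical mapping for illustration only.
# The real one is model.config.id2label and may differ.
ID2LABEL = {0: "bearish", 1: "bullish"}


def softmax(row):
    """Convert one row of logits to probabilities (numerically stable)."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label=ID2LABEL):
    """Return (label, probability) of the top class for each row of logits."""
    results = []
    for row in logits:
        probs = softmax(row)
        # Index of the largest probability in this row
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((id2label[best], probs[best]))
    return results


# Made-up logits for two comments, not real model output.
example_logits = [[2.0, -1.0], [-0.5, 1.5]]
decoded = decode(example_logits)
for label, prob in decoded:
    print(label, round(prob, 3))
```

Each row of logits yields one `(label, probability)` pair, which corresponds to the `y_dash` and `top_probs` values printed by the inference code.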