---
license: mit
language:
- ja
tags:
- PyTorch
- Transformers
pipeline_tag: text-classification
---

## Japanese Stock Comment Sentiment Model

This model is a sentiment classifier trained on comments and discussions about Japanese stocks. It determines whether a comment expresses bullish or bearish sentiment.

For training, a large collection of comments about individual stocks was gathered and labeled as either "bullish" or "bearish".

The model can support stock investors and market analysts in gathering information and making prompt decisions.

## How to use

### Part 1: Model Initialization

This section initializes the two components needed for prediction: the model and the tokenizer.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the model and tokenizer
model_path = "c299m/japanese_stock_sentiment"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path).to(device)
```

### Part 2: Text Prediction

Once the model and tokenizer are initialized, we can predict the sentiment of a given text. The text is classified as either "bullish" (positive sentiment) or "bearish" (negative sentiment).

```python
import numpy as np
import torch.nn.functional as F

# Text for inference
# (roughly: "The news was too good and it went limit-down... help me, Crestec...")
sample_text = "材料良すぎてストップ安、、助けてクレステック、、、"

# Tokenize the text
inputs = tokenizer(sample_text, return_tensors="pt")

# Set the model to evaluation mode
model.eval()

# Execute the inference without tracking gradients
with torch.no_grad():
    outputs = model(
        inputs["input_ids"].to(device),
        attention_mask=inputs["attention_mask"].to(device),
    )

# Obtain logits and apply softmax to convert them to probabilities
probabilities = F.softmax(outputs.logits, dim=1).cpu().numpy()

# Get the index of the class with the highest probability
y_preds = np.argmax(probabilities, axis=1)


# Convert the index to a label
def id2label(x):
    return model.config.id2label[int(x)]


y_dash = [id2label(x) for x in y_preds]

# Get the probability of the most likely class
top_probs = probabilities[np.arange(len(y_preds)), y_preds]

print(y_dash, top_probs)
```
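
The softmax and argmax steps in the snippet above can also be tried on their own, without downloading the model. The following is a minimal sketch using only the standard library. The names `ASSUMED_ID2LABEL`, `softmax`, and `classify` are hypothetical helpers, the label order is an assumption (the real mapping lives in `model.config.id2label`), and the logits are made-up values rather than real model output.

```python
import math

# Hypothetical label mapping for illustration only; the real one is
# stored in model.config.id2label and may use different names or order.
ASSUMED_ID2LABEL = {0: "bearish", 1: "bullish"}


def softmax(logits):
    """Convert one row of raw logits into probabilities that sum to 1."""
    # Subtract the largest logit first for numerical stability.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(batch_logits, id2label=ASSUMED_ID2LABEL):
    """Return a (label, probability) pair for each row of logits."""
    results = []
    for row in batch_logits:
        probs = softmax(row)
        # Index of the most probable class (the argmax).
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((id2label[best], probs[best]))
    return results


# Made-up logits for two comments, not real model output.
example_logits = [[2.0, -1.0], [-0.5, 1.5]]
for label, prob in classify(example_logits):
    print(f"{label}: {prob:.3f}")
```

With the real model, the rows of `outputs.logits` would take the place of `example_logits`, and `model.config.id2label` would replace the assumed mapping.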