
Korean Sentiment Analysis with BERT

This project performs sentiment analysis on Korean text using a pre-trained BERT model. The model has been fine-tuned on a sentiment analysis dataset to classify text as positive or negative.
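
As a usage sketch (assuming the repository id Dilwolf/Kakao_app-kr_sentiment shown on this page; the returned label names depend on the model's id2label mapping), the fine-tuned model can be loaded through the Transformers pipeline API:

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hub (repo id assumed from this card).
classifier = pipeline("text-classification", model="Dilwolf/Kakao_app-kr_sentiment")

# Example reviews: one positive, one negative.
print(classifier("카카오톡 너무 편리하고 좋아요!"))
print(classifier("업데이트 후 자꾸 멈춰서 불편합니다."))
```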

Fine-tuning

The model is fine-tuned on a custom dataset with the following configuration (a training sketch follows the list):

  • Number of Labels: 2 (positive and negative)
  • Training Epochs: 1
  • Batch Size: 20
  • Optimizer: AdamW with weight decay
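
A minimal fine-tuning sketch with the Hugging Face Trainer is given below. The base checkpoint, weight-decay value, and output directory are assumptions not stated in this card; train_dataset, val_dataset, and compute_metrics refer to the objects prepared in the Data Preparation and Evaluation sections further down.

```python
from transformers import (
    BertForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# Assumed base checkpoint; the card does not name the pre-trained BERT used.
BASE_MODEL = "bert-base-multilingual-cased"

model = BertForSequenceClassification.from_pretrained(BASE_MODEL, num_labels=2)

training_args = TrainingArguments(
    output_dir="kakao-sentiment",        # assumed output path
    num_train_epochs=1,                  # Training Epochs: 1
    per_device_train_batch_size=20,      # Batch Size: 20
    per_device_eval_batch_size=20,
    weight_decay=0.01,                   # AdamW with weight decay (value assumed)
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,         # tokenized split from the Data Preparation section
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics,     # metric helper sketched under Evaluation
)
trainer.train()
```

The Trainer's default optimizer is AdamW, and the weight_decay argument controls the decoupled weight decay it applies.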

Dataset

The dataset used for fine-tuning the model consists of Korean text samples labeled with sentiment categories. The reviews were scraped from the Google Play Store page of the Kakao app. The dataset is split into three parts (a splitting sketch follows the list):

  • Training Set: Used to train the model.
  • Validation Set: Used to evaluate the model during training and tune hyperparameters.
  • Test Set: Used to evaluate the final performance of the model.
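
One possible way to produce the three splits is sketched below; the file name, column names, and split ratios are assumptions, since the card does not state them.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed CSV of scraped Kakao app reviews with columns "text" and "label" (0 = negative, 1 = positive).
df = pd.read_csv("kakao_reviews.csv")

# Hold out a test set first, then split the remainder into train and validation (ratios assumed).
train_val_df, test_df = train_test_split(
    df, test_size=0.1, stratify=df["label"], random_state=42
)
train_df, val_df = train_test_split(
    train_val_df, test_size=0.1, stratify=train_val_df["label"], random_state=42
)
```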

Data Preparation

The text data is tokenized using BertTokenizerFast with truncation and padding to ensure uniform input lengths.
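
A sketch of this step is shown below; the max_length cap and the small Dataset wrapper are assumptions, and train_df/val_df/test_df come from the split sketch above.

```python
import torch
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained(BASE_MODEL)  # same checkpoint as in the fine-tuning sketch

def encode(texts):
    # Truncate long reviews and pad shorter ones so inputs share a uniform length.
    return tokenizer(texts, truncation=True, padding=True, max_length=128)

class ReviewDataset(torch.utils.data.Dataset):
    """Pairs the tokenized encodings with integer sentiment labels."""
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

train_dataset = ReviewDataset(encode(train_df["text"].tolist()), train_df["label"].tolist())
val_dataset = ReviewDataset(encode(val_df["text"].tolist()), val_df["label"].tolist())
test_dataset = ReviewDataset(encode(test_df["text"].tolist()), test_df["label"].tolist())
```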

Evaluation

The model is evaluated on the train, validation, and test sets using loss, accuracy, F1 score, precision, and recall as metrics. The results are shown in the table below.
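
A sketch of the metric computation is given below; weighted averaging is an assumption (plausible given how close the F1, precision, and recall figures are to accuracy), and the helper plugs into the Trainer from the fine-tuning sketch.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Accuracy, F1, precision, and recall from the Trainer's logits and labels."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }

# e.g. trainer.evaluate(test_dataset) then reports eval_loss plus the metrics above.
```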

Evaluation Results

Set    Loss      Accuracy  F1        Precision  Recall
Train  0.097011  0.967398  0.967397  0.967405   0.967398
Val    0.162700  0.945322  0.945321  0.945328   0.945322
Test   0.145638  0.948864  0.948864  0.948864   0.948864

License: apache-2.0
