license: apache-2.0
base_model: distilbert-base-uncased
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - f1
model-index:
  - name: sentiment-analysis-browser-extension
    results: []
language:
  - en

Fine-tuned BERT model

We open-source this fine-tuned BERT model to identify critical aspects within user reviews of adblocking extensions. For every user review, the model provides a criticality score in the range -1 to 1, where negative scores signify a higher probability of finding critical topics in the review.
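
As a rough illustration of how a score in the range -1 to 1 could be derived from a two-class classifier, the sketch below maps a pair of logits to a signed score. This is an assumption for illustration only: the exact scoring formula is not stated here, and both the function name `criticality_score` and the label order (index 0 = critical, index 1 = non-critical) are hypothetical.

```python
import math


def criticality_score(logits):
    """Map two class logits to a score in [-1, 1].

    Assumption (not confirmed by the model card): logits[0] is the
    "critical" class and logits[1] is the "non-critical" class.
    """
    # Numerically stable softmax over the two logits.
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    p_critical, p_non_critical = (e / total for e in exps)
    # Negative values mean the review more likely contains critical topics.
    return p_non_critical - p_critical


# Hypothetical logits for three reviews.
for logits in ([3.0, -2.0], [0.0, 0.0], [-1.5, 2.5]):
    print(logits, round(criticality_score(logits), 3))
```

With this convention, a review the classifier is unsure about scores close to 0, and the sign indicates which class is more likely.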

We used distilbert-base-uncased as the base model and fine-tuned it on a manually annotated dataset of webstore reviews.

Further details can be found in our AsiaCCS paper, "From User Insights to Actionable Metrics: A User-Focused Evaluation of Privacy-Preserving Browser Extensions".

Note

We haven't tested the model's accuracy on user reviews from other categories, but we are open to discussing the possibility of extending it to other product categories.

Intended uses & limitations

The model is released for free use. It has not been trained on any private user data. Please cite the above paper in your work.

Evaluation data

The model achieves the following results on the evaluation set:

  • Loss: 0.4768
  • Accuracy: 0.8615
  • F1: 0.8816
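
For readers unfamiliar with the two metrics above, here is a minimal sketch of how accuracy and binary F1 are computed from labels and predictions. The helper name `accuracy_and_f1`, the toy labels, and the choice of `1` as the positive class are assumptions for illustration; the card does not state which class was treated as positive or which F1 averaging was used.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, binary F1) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Toy example, not the actual evaluation data.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} f1={f1:.4f}")
```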

Training procedure

The training dataset comprised 620 reviews and the test dataset comprised 150 reviews. The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 6
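
To make the hyperparameters above concrete, the following sketch works out how many optimizer steps they imply and how a linear schedule would decay the learning rate over those steps. It assumes a single device, no gradient accumulation, no warmup, and that the final partial batch of each epoch is kept; none of these details are stated above, and the function name `linear_lr` is hypothetical.

```python
import math

TRAIN_EXAMPLES = 620   # training reviews, from the text above
BATCH_SIZE = 16        # train_batch_size
NUM_EPOCHS = 6         # num_epochs
BASE_LR = 2e-5         # learning_rate

# Assumption: the last, smaller batch of each epoch is not dropped.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp-up during warmup (unused when warmup_steps == 0).
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)


print("steps per epoch:", steps_per_epoch)
print("total steps:", total_steps)
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step))
```

Under these assumptions the run is short (a few hundred optimizer steps), so the learning rate falls steadily from its initial value to zero across the six epochs.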

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.14.1