---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
model-index:
- name: sentiment-analysis-browser-extension
  results: []
language:
- en
---


# Fine-tuned BERT model
We open-source this fine-tuned BERT model to identify critical aspects in user reviews of adblocking extensions. For each review, the model provides a criticality score in the range -1 to 1; negative scores signify a higher probability that the review discusses critical topics.

We used [`distilbert-base-uncased`](https://huggingface.co/distilbert-base-uncased) as the base model and fine-tuned it on a manually annotated dataset of webstore reviews.
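
A minimal sketch of how a review could be scored with the `transformers` text-classification pipeline. Several things here are assumptions rather than facts from this card: the repository ID passed as `model_id` is a placeholder, the critical class is assumed to be exposed as `LABEL_0`, and the mapping from label and probability to a signed score is only one plausible convention (the card does not say how the -1 to 1 score is derived). Check `config.id2label` on the loaded model before relying on it.

```python
def signed_score(label: str, probability: float, negative_label: str = "LABEL_0") -> float:
    """Map a predicted label and its probability to a score in [-1, 1].

    Assumption: `negative_label` names the critical class; the real label
    names may differ, so inspect the model's config.id2label first.
    """
    return -probability if label == negative_label else probability


def score_reviews(reviews, model_id):
    """Score a list of review strings with a fine-tuned classifier.

    `model_id` is a placeholder for the actual Hub repository ID.
    """
    # Imported lazily so signed_score() works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True keeps long reviews within the model's maximum input length.
    predictions = classifier(reviews, truncation=True)
    return [signed_score(p["label"], p["score"]) for p in predictions]
```

Calling `score_reviews(["This extension broke my banking site."], model_id)` would return one signed score per review, with values closer to -1 suggesting critical content under the assumptions above.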

Further details can be found in our AsiaCCS paper, [From User Insights to Actionable Metrics: A User-Focused Evaluation of Privacy-Preserving Browser Extensions](https://doi.org/10.1145/3634737.3657028).

### Note
We have not tested its accuracy on user reviews from other categories, but we are open to discussing the possibility of extending it to other product categories.

## Intended uses & limitations
The model has been released to be freely used. It has not been trained on any private user data. Please cite the above paper in your work.

## Evaluation data

The model achieves the following results on the evaluation set:
- Loss: 0.4768
- Accuracy: 0.8615
- F1: 0.8816
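
For readers who want to compute the same kind of metrics on their own annotated reviews, here is a small dependency-free sketch. It assumes a binary task and F1 computed on a single positive class; the card does not state which F1 averaging was used, and the labels in the usage note are made-up toy values, not the evaluation data.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, binary F1) for two equal-length label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1
```

For example, `accuracy_and_f1([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])` gives an accuracy of 0.6 and an F1 of about 0.667.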

## Training procedure

The training dataset comprised 620 reviews, and the test dataset had 150 reviews. The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
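
The hyperparameters above could be expressed as `transformers.TrainingArguments` keyword arguments roughly as follows. This is a sketch, not the original training script: it assumes single-device training (so the listed batch sizes map directly to the per-device values), and `output_dir` is a hypothetical path.

```python
# Keyword arguments mirroring the hyperparameter list in this card.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes one device
    "per_device_eval_batch_size": 16,   # assumes one device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 6,
}


def build_training_arguments(output_dir="distilbert-reviews"):
    """Build TrainingArguments from the values above.

    `output_dir` is a hypothetical location for checkpoints and logs.
    """
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

The resulting object can be passed to a `Trainer` together with the model, tokenizer, and tokenized datasets.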

### Framework versions

- Transformers 4.34.0
- PyTorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1