---
license: gpl-3.0
---

# reviewBERT-base

This model is a fine-tuned version of [`bert-base-uncased`](https://huggingface.co/google-bert/bert-base-uncased), trained on a large dataset
of mobile app reviews. It is designed to process text from mobile app reviews and is intended to improve performance
on tasks such as feature extraction, sentiment analysis, and review summarization.

## Model Details

- **Model Architecture**: BERT (Bidirectional Encoder Representations from Transformers)
- **Base Model**: `bert-base-uncased`
- **Pre-training Extension**: Mobile app reviews dataset
- **Language**: English

## Dataset

The extended pre-training was performed using a diverse dataset of mobile app reviews collected from various app stores.
The dataset includes reviews of different lengths, sentiments, and topics, providing a robust foundation for understanding
the nuances of mobile app user feedback.

## Training Procedure

The model was fine-tuned using the following parameters:

- **Batch Size**: 16
- **Learning Rate**: 2e-5
- **Epochs**: 2
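
As a rough illustration of what these settings imply, the sketch below counts the optimizer steps for a given number of training examples. It assumes one optimizer step per batch (no gradient accumulation, single device). The names `TrainingConfig` and `total_optimizer_steps` are hypothetical, and `num_reviews` is a placeholder because this card does not state the dataset size.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    batch_size: int = 16         # from the parameter list above
    learning_rate: float = 2e-5  # from the parameter list above
    epochs: int = 2              # from the parameter list above


def total_optimizer_steps(num_examples: int, config: TrainingConfig) -> int:
    """Steps over all epochs, assuming one optimizer step per batch."""
    steps_per_epoch = math.ceil(num_examples / config.batch_size)
    return steps_per_epoch * config.epochs


config = TrainingConfig()
num_reviews = 100_000  # hypothetical dataset size, not stated in this card
print(total_optimizer_steps(num_reviews, config))  # 12500
```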

## Usage

### Load the model

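```python
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained('quim-motger/reviewBERT-base')
# Loads the encoder with a sequence-classification head on top. If the checkpoint
# does not include a trained head, one is randomly initialized and must be
# fine-tuned on labeled data before its predictions are meaningful.
model = BertForSequenceClassification.from_pretrained('quim-motger/reviewBERT-base')
```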

### Example: Sentiment Analysis

```python
from transformers import pipeline

# Reuses the `model` and `tokenizer` loaded above.
nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

review = "This app is fantastic! I love the user-friendly interface and features."
result = nlp(review)

print(result)
# Illustrative output; actual labels and scores depend on the classification head:
# [{'label': 'POSITIVE', 'score': 0.98}]
```

### Example: Review Summarization

```python
from transformers import pipeline

# BERT is encoder-only, so the 'summarization' pipeline cannot use `model` directly.
# This assumes a seq2seq checkpoint fine-tuned for summarization (hypothetical path).
summarizer = pipeline('summarization', model='path/to/summarization-checkpoint')

long_review = (
    "I have been using this app for a while and it has significantly improved my productivity. "
    "The range of features is excellent, and the user interface is intuitive. However, there are occasional "
    "bugs that need fixing."
)
summary = summarizer(long_review, max_length=50, min_length=25, do_sample=False)

print(summary)
# Illustrative output; the actual text depends on the model:
# [{'summary_text': 'The app has significantly improved my productivity with its excellent features and intuitive user interface. However, occasional bugs need fixing.'}]
```
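
### Example: Review Embeddings

One common way to use an encoder like this is to turn each review into a fixed-size vector for downstream tasks such as clustering or similarity search. The sketch below assumes that mean pooling over the last hidden state is an acceptable pooling strategy; this card does not specify one. `mean_pool` and `embed_reviews` are hypothetical helper names, not part of the released model.

```python
import torch
from transformers import BertModel, BertTokenizer


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                    # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)                           # avoid division by zero
    return summed / counts


def embed_reviews(reviews, checkpoint="quim-motger/reviewBERT-base"):
    """Return one embedding per review (shape: len(reviews) x hidden_size)."""
    tokenizer = BertTokenizer.from_pretrained(checkpoint)
    encoder = BertModel.from_pretrained(checkpoint)  # bare encoder, no task head
    encoder.eval()
    batch = tokenizer(reviews, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**batch)
    return mean_pool(outputs.last_hidden_state, batch["attention_mask"])
```

Calling `embed_reviews(["Great app", "Crashes on startup"])` downloads the checkpoint and returns a tensor with one row per review; for a BERT-base encoder each row has 768 dimensions.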