|
--- |
|
language: en |
|
tags: |
|
- distilbert |
|
- needmining |
|
license: apache-2.0 |
|
metrics:
|
- f1 |
|
--- |
|
|
|
# Finetuned-Distilbert-needmining (uncased) |
|
|
|
This model is a fine-tuned version of the [DistilBERT base model](https://huggingface.co/distilbert-base-uncased). It was trained to identify need-containing sentences in Amazon product reviews.
|
|
|
## Model description |
|
|
|
This model is part of ongoing research; more information will be added once the research is published.
|
|
|
## Intended uses & limitations |
|
|
|
You can use this model to identify sentences that contain customer needs in user-generated content. This can serve as a filtering step that removes uninformative content before market research analysis.
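As a sketch of such a filtering step, the function below splits a review into sentences and keeps only those the classifier flags as need-containing. The `dummy_classifier`, the `filter_need_sentences` name, and the 0.5 threshold are illustrative assumptions, not part of the model; in practice you would pass in the `transformers` pipeline from the "How to use" section.

```python
import re

def filter_need_sentences(review, classifier, threshold=0.5):
    """Keep only the sentences of a review that the classifier
    labels as containing a customer need."""
    # Naive regex-based sentence splitting; a real pipeline might use
    # a proper sentence tokenizer instead.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]
    kept = []
    for sentence in sentences:
        result = classifier(sentence)[0]  # e.g. {'label': 'contains need', 'score': 0.94}
        if result["label"] == "contains need" and result["score"] >= threshold:
            kept.append(sentence)
    return kept

# Stand-in for the transformers pipeline, used here only so the sketch runs:
def dummy_classifier(text):
    label = "contains need" if "wish" in text or "need" in text else "no need"
    return [{"label": label, "score": 0.9}]

review = "Great color. I wish the strap were longer."
print(filter_need_sentences(review, dummy_classifier))
```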
|
|
|
### How to use |
|
|
|
You can use this model directly with a pipeline for text classification: |
|
|
|
```python |
|
>>> from transformers import pipeline |
|
>>> classifier = pipeline("text-classification", model="svenstahlmann/finetuned-distilbert-needmining") |
|
>>> classifier("the plasic feels super cheap.") |
|
[{'label': 'contains need', 'score': 0.9397542476654053}] |
|
``` |
|
|
|
### Limitations and bias |
|
|
|
We are not aware of any bias in the training data. |
|
|
|
## Training data |
|
|
|
The training was done on a dataset of 6400 sentences. The sentences were taken from Amazon product reviews and labeled according to whether they express customer needs.
|
## Training procedure |
|
For the training, we used [Population Based Training (PBT)](https://www.deepmind.com/blog/population-based-training-of-neural-networks) and optimized for F1 score on a validation set of 1600 sentences.
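The exact PBT configuration is not published here, but the core exploit/explore loop can be sketched as follows. The `pbt_step` function, the single `lr` hyperparameter, and the random scores are illustrative assumptions; in the real setup each member's score would be its F1 on the validation set after partial training.

```python
import random

def pbt_step(population, perturb=0.2, fraction=0.25):
    """One exploit/explore step of Population Based Training.

    `population` is a list of dicts, each holding a hyperparameter
    ('lr') and its latest validation score. The worst performers copy
    the hyperparameters of the best ones (exploit) and then perturb
    them (explore). Returns the current best member."""
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    cut = max(1, int(len(ranked) * fraction))
    top, bottom = ranked[:cut], ranked[-cut:]
    for loser in bottom:
        winner = random.choice(top)
        loser["lr"] = winner["lr"]                                 # exploit
        loser["lr"] *= random.choice([1 - perturb, 1 + perturb])   # explore
    return ranked[0]

random.seed(0)
# Toy population with random scores standing in for validation F1.
population = [{"lr": 10 ** random.uniform(-6, -3), "score": random.random()}
              for _ in range(8)]
best = pbt_step(population)
print(best["lr"], best["score"])
```

In practice this loop runs repeatedly during training, so poorly performing hyperparameter settings are continually replaced by perturbed copies of the best ones.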
|
|
|
|
|
### Preprocessing |
|
|
|
The preprocessing follows the [Distilbert base model](https://huggingface.co/distilbert-base-uncased). |
|
|
|
|
|
### Fine-tuning



The model was fine-tuned on a Titan RTX for 1 hour.
|
|
|
## Evaluation results |
|
|
|
Results on the validation set: |
|
|
|
| F1 | |
|
|:----:| |
|
| 76.0 | |
|
|
|
|
|
### BibTeX entry and citation info |
|
|
|
Coming soon.