---
library_name: transformers
datasets:
- stanfordnlp/imdb
metrics:
- accuracy
tags:
- PyTorch
model-index:
- name: distilbert-imdb
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: imdb
      type: imdb
      args: plain_text
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9316
pipeline_tag: text-classification
license: apache-2.0
language:
- en
---
# distilbert-imdb
This is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased), trained on the IMDb dataset.
## Performance
- Loss: 0.1958
- Accuracy: 0.932
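For reference, accuracy is the fraction of reviews whose predicted label matches the gold label. The sketch below shows that computation on invented toy labels, not on this model's actual predictions; the function name `accuracy` is illustrative.

```python
def accuracy(predictions, references):
    """Return the fraction of predictions that equal their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy example with invented labels (1 = positive, 0 = negative).
toy_predictions = [1, 0, 1, 1, 0]
toy_references = [1, 0, 0, 1, 0]
print(accuracy(toy_predictions, toy_references))  # 4 of 5 correct -> 0.8
```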
## How to Get Started with the Model
Use the code below to get started with the model:
```python
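from transformers import DistilBertTokenizer, pipeline

# Load the tokenizer from the base checkpoint the model was fine-tuned from.
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
classifier = pipeline("sentiment-analysis", model="3oclock/distilbert-imdb", tokenizer=tokenizer)

# The result is a list with one {"label": ..., "score": ...} dict per input.
result = classifier("I love this movie!")
print(result)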
```
## Model Details
### Model Description
This model card describes a 🤗 Transformers model fine-tuned on the IMDb dataset.
- **Developed by:** Ge Li
- **Model type:** DistilBERT for Sequence Classification
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** `distilbert-base-uncased`
## Uses
### Direct Use
This model can be used directly for sentiment analysis on movie reviews. It is best suited for classifying English-language text that is similar in nature to movie reviews.
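The `pipeline` call returns a list of dictionaries with `label` and `score` keys. The sketch below turns output of that shape into readable sentiment names. It assumes the checkpoint exposes the default `LABEL_0`/`LABEL_1` names and that, as in the IMDb dataset, 0 means negative and 1 means positive; check `model.config.id2label` before relying on this mapping. The helper name `to_sentiment` and the sample scores are hypothetical.

```python
# Assumed mapping: default label names, with IMDb's 0 = negative, 1 = positive.
ASSUMED_LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}


def to_sentiment(results, label_names=ASSUMED_LABEL_NAMES):
    """Convert pipeline-style output into (sentiment, score) tuples.

    Labels missing from the mapping are passed through unchanged.
    """
    return [
        (label_names.get(item["label"], item["label"]), item["score"])
        for item in results
    ]


# Hand-written stand-in for `classifier([...])` output, not real model scores.
sample_results = [
    {"label": "LABEL_1", "score": 0.98},
    {"label": "LABEL_0", "score": 0.91},
]
print(to_sentiment(sample_results))
```

Passing unknown labels through unchanged keeps the helper usable if the checkpoint turns out to define its own label names.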
### Downstream Use
This model can be fine-tuned on other sentiment analysis tasks or adapted for tasks like text classification in domains similar to IMDb movie reviews.
### Out-of-Scope Use
The model may not perform well on non-English text or text that is significantly different in style and content from the IMDb dataset (e.g., technical documents, social media posts).
## Bias, Risks, and Limitations
### Bias
The IMDb dataset consists primarily of English-language movie reviews, so the model may not generalize well to other languages or other types of reviews.
### Risks
Misclassification in sentiment analysis can lead to incorrect conclusions in applications relying on this model.
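One way to limit the impact of misclassification is to act only on confident predictions and route the rest to manual review. The following is a minimal sketch, assuming pipeline-style output dictionaries; the `0.9` threshold and the function name `split_by_confidence` are illustrative choices, not values tuned for this model.

```python
def split_by_confidence(texts, results, threshold=0.9):
    """Separate predictions into confident ones and ones needing review.

    `results` is expected to look like pipeline output: one dict per text
    with "label" and "score" keys. The threshold is an arbitrary example.
    """
    if len(texts) != len(results):
        raise ValueError("texts and results must have the same length")
    confident, needs_review = [], []
    for text, result in zip(texts, results):
        bucket = confident if result["score"] >= threshold else needs_review
        bucket.append((text, result["label"], result["score"]))
    return confident, needs_review


# Invented inputs and scores for illustration only.
texts = ["I love this movie!", "It was fine, I guess."]
results = [
    {"label": "LABEL_1", "score": 0.99},
    {"label": "LABEL_1", "score": 0.62},
]
confident, needs_review = split_by_confidence(texts, results)
print(len(confident), len(needs_review))  # 1 1
```

Classifier scores are not guaranteed to be calibrated, so any threshold should be validated on held-out data before it is used to make decisions.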
### Limitations
The model was trained on a dataset of movie reviews, so it may not perform as well on other types of text data.
### Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.