---
license: apache-2.0
datasets:
  - sst2
language:
  - en
metrics:
  - accuracy
library_name: transformers
pipeline_tag: text-classification
widget:
  - text: >-
      this film 's relationship to actual tension is the same as what
      christmas-tree flocking in a spray can is to actual snow : a poor -- if
      durable -- imitation .
    example_title: negative
  - text: director rob marshall went out gunning to make a great one .
    example_title: positive
---

# bert-base-uncased-finetuned-sst2-v2

BERT ("bert-base-uncased") finetuned on SST-2 (Stanford Sentiment Treebank Binary).

This model pertains to the "Try it out!" exercise in [section 4 of chapter 3 of the Hugging Face NLP Course](https://huggingface.co/learn/nlp-course/chapter3/4).

It was trained using a custom PyTorch loop without Hugging Face Accelerate.
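A minimal sketch of such a loop is below. The hyperparameters shown (batch size, learning rate, epoch count, linear scheduler) are assumptions following the course's defaults, not necessarily the settings used for this checkpoint; the exact training code is in the notebook linked below.

```python
import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          DataCollatorWithPadding, get_scheduler)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Tokenize SST-2 and massage the columns into what the model's forward() expects.
raw = load_dataset("sst2")
tokenized = raw.map(lambda batch: tokenizer(batch["sentence"], truncation=True), batched=True)
tokenized = tokenized.remove_columns(["sentence", "idx"]).rename_column("label", "labels")
tokenized.set_format("torch")

collator = DataCollatorWithPadding(tokenizer=tokenizer)
train_loader = DataLoader(tokenized["train"], shuffle=True, batch_size=8, collate_fn=collator)

model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2).to(device)
optimizer = AdamW(model.parameters(), lr=5e-5)  # assumed hyperparameters
num_epochs = 3
num_training_steps = num_epochs * len(train_loader)
lr_scheduler = get_scheduler("linear", optimizer=optimizer,
                             num_warmup_steps=0, num_training_steps=num_training_steps)

model.train()
for epoch in range(num_epochs):
    for batch in train_loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss  # cross-entropy is computed internally when "labels" is passed
        loss.backward()
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
```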

Code: https://github.com/sambitmukherjee/hf-nlp-course-exercises/blob/main/chapter3/section4.ipynb

Experiment tracking: https://wandb.ai/sadhaklal/bert-base-uncased-finetuned-sst2-v2

## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="sadhaklal/bert-base-uncased-finetuned-sst2-v2")
print(classifier("uneasy mishmash of styles and genres ."))
print(classifier("by the end of no such thing the audience , like beatrice , has a watchful affection for the monster ."))
```

## Dataset

From the dataset page:

> The Stanford Sentiment Treebank is a corpus with fully labeled parse trees that allows for a complete analysis of the compositional effects of sentiment in language...
>
> Binary classification experiments on full sentences (negative or somewhat negative vs somewhat positive or positive with neutral sentences discarded) refer to the dataset as SST-2 or SST binary.

Examples: https://huggingface.co/datasets/sst2/viewer
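To inspect examples locally, the data can be loaded with the `datasets` library (using the `sst2` dataset id from this card's metadata; `load_dataset("glue", "sst2")` is an equivalent route):

```python
from datasets import load_dataset

# DatasetDict with "train", "validation", and "test" splits;
# each example has "idx", "sentence", and "label" (0 = negative, 1 = positive).
sst2 = load_dataset("sst2")
print(sst2)
print(sst2["train"][0])
```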

## Metric

Accuracy on the `validation` split of SST-2: **0.9278**
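A sketch of how this number could be reproduced with the `evaluate` library (batch size and device handling are assumptions; the validation split is used because the SST-2 test split's labels are hidden):

```python
import torch
import evaluate
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          DataCollatorWithPadding)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_id = "sadhaklal/bert-base-uncased-finetuned-sst2-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).to(device)

val = load_dataset("sst2", split="validation")
val = val.map(lambda batch: tokenizer(batch["sentence"], truncation=True), batched=True)
val = val.remove_columns(["sentence", "idx"]).rename_column("label", "labels")
val.set_format("torch")
loader = DataLoader(val, batch_size=32,
                    collate_fn=DataCollatorWithPadding(tokenizer=tokenizer))

metric = evaluate.load("accuracy")
model.eval()
for batch in loader:
    batch = {k: v.to(device) for k, v in batch.items()}
    with torch.no_grad():
        logits = model(**batch).logits
    metric.add_batch(predictions=logits.argmax(dim=-1), references=batch["labels"])

print(metric.compute())  # should come out near {'accuracy': 0.9278}
```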