---
language:
- en
datasets:
- imdb
metrics:
- accuracy
---

# bert-imdb-1hidden

## Model description

A `bert-base-uncased` model restricted to 1 hidden layer and fine-tuned for sequence classification on the imdb dataset, loaded with the `datasets` library.

## Intended uses & limitations

#### How to use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

pretrained = "lannelin/bert-imdb-1hidden"

tokenizer = AutoTokenizer.from_pretrained(pretrained)
model = AutoModelForSequenceClassification.from_pretrained(pretrained)

LABELS = ["negative", "positive"]

def get_sentiment(text: str):
    inputs = tokenizer.encode_plus(text, return_tensors="pt")
    output = model(**inputs)[0].squeeze()  # logits for the two classes
    return LABELS[output.argmax()]

print(get_sentiment("What a terrible film!"))
```

#### Limitations and bias

No special consideration was given to limitations and bias; any bias present in the imdb dataset may be reflected in the model's output.

## Training data

Initialised with [bert-base-uncased](https://huggingface.co/bert-base-uncased)

Fine-tuned on [imdb](https://huggingface.co/datasets/imdb)

## Training procedure

The model was fine-tuned for 1 epoch with a batch size of 64, a learning rate of 5e-5, and a maximum sequence length of 512. A sketch of a comparable setup appears at the end of this card.

## Eval results

Accuracy on the imdb test set: 0.87132
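A minimal sketch of how this figure could be reproduced is shown below. It scores examples one at a time for simplicity and is not the exact evaluation script used:

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

pretrained = "lannelin/bert-imdb-1hidden"

tokenizer = AutoTokenizer.from_pretrained(pretrained)
model = AutoModelForSequenceClassification.from_pretrained(pretrained)
model.eval()

# imdb uses label 0 = negative, 1 = positive, matching the model's classes
test = load_dataset("imdb", split="test")

correct = 0
with torch.no_grad():
    for example in test:
        inputs = tokenizer(example["text"], truncation=True,
                           max_length=512, return_tensors="pt")
        pred = model(**inputs).logits.argmax(dim=-1).item()
        correct += int(pred == example["label"])

print(f"accuracy: {correct / len(test):.5f}")
```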
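Finally, a minimal sketch of how a model like this could be initialised and fine-tuned with the hyperparameters above. The `Trainer` setup and the tokenisation helper here are assumptions for illustration, not the original training script:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, BertConfig,
                          BertForSequenceClassification,
                          Trainer, TrainingArguments)

# Truncate bert-base-uncased to a single hidden layer; the retained layer
# keeps its pretrained weights and the remaining layers are dropped.
config = BertConfig.from_pretrained("bert-base-uncased",
                                    num_hidden_layers=1, num_labels=2)
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                      config=config)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
dataset = load_dataset("imdb")

def tokenize(batch):
    # Maximum sequence length of 512, as stated in "Training procedure"
    return tokenizer(batch["text"], truncation=True, max_length=512,
                     padding="max_length")

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Hyperparameters from the "Training procedure" section above
args = TrainingArguments(
    output_dir="bert-imdb-1hidden",
    num_train_epochs=1,
    per_device_train_batch_size=64,
    learning_rate=5e-5,
)

trainer = Trainer(model=model, args=args, train_dataset=dataset["train"])
trainer.train()
```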