

mrm8488/bert-uncased-finetuned-qnli



Contributed by mrm8488 (Manuel Romero)

How to use this model directly from the 🤗/transformers library:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("mrm8488/bert-uncased-finetuned-qnli")
model = AutoModelForSequenceClassification.from_pretrained("mrm8488/bert-uncased-finetuned-qnli")
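Once loaded, the model can be run on a QNLI-style (question, sentence) pair. The sketch below is illustrative: the example texts are invented, and the label names should be read from `model.config.id2label` rather than assumed.

```python
# Illustrative inference sketch for a QNLI-style pair. The question and
# sentence below are made-up examples; the label mapping comes from the
# model config rather than being hard-coded.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("mrm8488/bert-uncased-finetuned-qnli")
model = AutoModelForSequenceClassification.from_pretrained("mrm8488/bert-uncased-finetuned-qnli")

question = "Where is the Eiffel Tower located?"
sentence = "The Eiffel Tower is a wrought-iron lattice tower in Paris, France."

# QNLI pairs are encoded as two segments: question first, sentence second.
inputs = tokenizer(question, sentence, return_tensors="pt",
                   truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, num_labels)
probs = torch.softmax(logits, dim=-1)
pred = int(probs.argmax(dim=-1))
print(model.config.id2label[pred], probs.tolist())
```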

BERT fine-tuned on QNLI + compression (BERT-of-Theseus)

I took a BERT model fine-tuned on SQuAD v2 and then fine-tuned it on QNLI using compression (with a constant replacing rate), as proposed in BERT-of-Theseus.
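The core idea of BERT-of-Theseus can be sketched as follows. This is a deliberately simplified illustration, not the authors' implementation: during training, each predecessor module is independently swapped for its smaller successor with a fixed probability (the constant replacing rate); in the real method a successor typically stands in for a group of predecessor layers, which is collapsed to one-to-one here.

```python
import random

def theseus_forward(predecessor_layers, successor_layers, x,
                    replacing_rate, apply_layer):
    """Sketch of module replacement with a constant replacing rate.

    On every forward pass, each predecessor module is independently
    replaced by its successor with probability `replacing_rate`.
    `apply_layer(layer, x)` stands in for calling a real transformer
    layer; here layers can be any objects.
    """
    for pred, succ in zip(predecessor_layers, successor_layers):
        layer = succ if random.random() < replacing_rate else pred
        x = apply_layer(layer, x)
    return x

# With replacing_rate=1.0 only successor modules run, i.e. the final
# compressed model; with 0.0 the original predecessor model runs.
calls = []
out = theseus_forward(["p1", "p2"], ["s1", "s2"], 5, 1.0,
                      lambda layer, x: (calls.append(layer), x)[1])
print(calls)   # every predecessor was replaced by its successor
```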

Details of the downstream task (QNLI):

Getting the dataset


mkdir QNLI_dataset
mv *.tsv QNLI_dataset
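For reference, each QNLI split is a tab-separated file. The column layout assumed below (index, question, sentence, label) follows the GLUE distribution; verify it against your actual download.

```python
import csv
import io

# Minimal sketch of parsing a GLUE QNLI .tsv split.
# Assumed columns: index, question, sentence, label.
sample_tsv = (
    "index\tquestion\tsentence\tlabel\n"
    "0\tWhat is QNLI?\tQNLI is a question-answering NLI task derived from SQuAD.\tentailment\n"
)

rows = list(csv.DictReader(io.StringIO(sample_tsv), delimiter="\t"))
for row in rows:
    print(row["question"], "->", row["label"])
```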

Model training

The model was trained on a Tesla P100 GPU with 25 GB of RAM using the following command:

!python /content/BERT-of-Theseus/ \
  --model_name_or_path deepset/bert-base-cased-squad2 \
  --task_name qnli \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir /content/QNLI_dataset \
  --max_seq_length 128 \
  --per_gpu_train_batch_size 32 \
  --per_gpu_eval_batch_size 32 \
  --learning_rate 2e-5 \
  --save_steps 2000 \
  --num_train_epochs 50 \
  --output_dir /content/ouput_dir \
  --evaluate_during_training \
  --replacing_rate 0.7 \
  --steps_for_replacing 2500 


Model                         Accuracy
BERT-base                     91.2
BERT-of-Theseus               88.8
bert-uncased-finetuned-qnli   87.2
DistilBERT                    85.3


Created by Manuel Romero/@mrm8488

Made with ♥ in Spain