# tinyroberta-mrqa

This is the *distilled* version of the [VMware/roberta-large-mrqa](https://huggingface.co/VMware/roberta-large-mrqa) model. It reaches comparable prediction quality to the base model and runs twice as fast.

## Overview

**Language model:** tinyroberta-mrqa
**Language:** English
**Downstream task:** Extractive QA
**Training data:** MRQA
**Eval data:** MRQA

## Hyperparameters

### Distillation Hyperparameters

```
batch_size = 96
n_epochs = 4
base_LM_model = "deepset/tinyroberta-squad2-step1"
max_seq_len = 384
learning_rate = 3e-5
lr_schedule = LinearWarmup
warmup_proportion = 0.2
doc_stride = 128
max_query_length = 64
distillation_loss_weight = 0.75
temperature = 1.5
teacher = "VMware/roberta-large-mrqa"
```

### Fine-tuning Hyperparameters

We fine-tuned the distilled model on the MRQA training set.

```
learning_rate = 1e-5
num_train_epochs = 3
weight_decay = 0.01
per_device_train_batch_size = 16
n_gpus = 3
```

## Distillation

This model is inspired by [deepset/tinyroberta-squad2](https://huggingface.co/deepset/tinyroberta-squad2). We start with a base checkpoint of [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2) and perform further task-specific prediction-layer distillation with [VMware/roberta-large-mrqa](https://huggingface.co/VMware/roberta-large-mrqa) as the teacher. We then fine-tune the distilled model on the MRQA training set.

## Usage

### In Transformers

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "VMware/tinyroberta-mrqa"

# a) Get predictions with the question-answering pipeline
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'What is tinyroberta-mrqa distilled from?',  # example inputs
    'context': 'tinyroberta-mrqa is distilled from the roberta-large-mrqa model.'
}
res = nlp(QA_input)

# b) Load the model & tokenizer directly
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

## Performance

We evaluated the model on the MRQA dev and test sets using the SQuAD metrics (exact match and F1).

```
eval exact match: 69.2
eval f1 score:    79.6
test exact match: 52.8
test f1 score:    63.4
```
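
For reference, the sketch below shows one common way to combine a hard-label span loss with a temperature-scaled soft-label term using the `distillation_loss_weight` and `temperature` values from the distillation hyperparameters above. The actual training code is not published here, so the function name and the exact loss formulation are assumptions; in extractive QA such a loss would typically be computed for the start and end logits separately and averaged.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, gold_positions,
                      distillation_loss_weight=0.75, temperature=1.5):
    # Hypothetical prediction-layer distillation loss for extractive QA.
    # student_logits / teacher_logits: (batch, seq_len) start *or* end logits.
    # gold_positions: (batch,) gold start or end token indices.

    # Hard-label loss against the gold answer-span boundary.
    ce_loss = F.cross_entropy(student_logits, gold_positions)

    # Soft-label loss: match the teacher's temperature-smoothed distribution.
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Weighted combination: a weight of 0.75 puts most of the emphasis on the teacher signal.
    return distillation_loss_weight * kd_loss + (1.0 - distillation_loss_weight) * ce_loss
```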
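
If you load the model and tokenizer directly (option b in the usage snippet above), a minimal sketch of decoding an answer from the start/end logits looks like this; the question and context are illustrative only, and no span-validity checks are applied.

```python
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_name = "VMware/tinyroberta-mrqa"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

question = "What is tinyroberta-mrqa distilled from?"
context = "tinyroberta-mrqa is distilled from the roberta-large-mrqa model."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Take the most likely start and end token positions and decode that span.
start = torch.argmax(outputs.start_logits)
end = torch.argmax(outputs.end_logits)
answer = tokenizer.decode(inputs["input_ids"][0][start:end + 1], skip_special_tokens=True)
print(answer)
```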
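
The exact match and F1 scores above follow the standard SQuAD metric definitions. For illustration, they can be computed on your own predictions with the `evaluate` library's `squad` metric; the prediction/reference pair below is a placeholder, not data from the MRQA evaluation.

```python
import evaluate

squad_metric = evaluate.load("squad")

# Placeholder prediction/reference pair in SQuAD format, for illustration only.
predictions = [{"id": "example-1", "prediction_text": "roberta-large-mrqa"}]
references = [{"id": "example-1",
               "answers": {"text": ["roberta-large-mrqa"], "answer_start": [38]}}]

print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```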