---
widget:
- context: While deep and large pre-trained models are the state-of-the-art for various
    natural language processing tasks, their huge size poses significant challenges
    for practical uses in resource constrained settings. Recent works in knowledge
    distillation propose task-agnostic as well as task-specific methods to compress
    these models, with task-specific ones often yielding higher compression rate.
    In this work, we develop a new task-agnostic distillation framework XtremeDistilTransformers
    that leverages the advantage of task-specific methods for learning a small universal
    model that can be applied to arbitrary tasks and languages. To this end, we study
    the transferability of several source tasks, augmentation resources and model
    architecture for distillation. We evaluate our model performance on multiple tasks,
    including the General Language Understanding Evaluation (GLUE) benchmark, SQuAD
    question answering dataset and a massive multi-lingual NER dataset with 41 languages.
  example_title: xtremedistil q1
  text: What is XtremeDistil?
- context: While deep and large pre-trained models are the state-of-the-art for various
    natural language processing tasks, their huge size poses significant challenges
    for practical uses in resource constrained settings. Recent works in knowledge
    distillation propose task-agnostic as well as task-specific methods to compress
    these models, with task-specific ones often yielding higher compression rate.
    In this work, we develop a new task-agnostic distillation framework XtremeDistilTransformers
    that leverages the advantage of task-specific methods for learning a small universal
    model that can be applied to arbitrary tasks and languages. To this end, we study
    the transferability of several source tasks, augmentation resources and model
    architecture for distillation. We evaluate our model performance on multiple tasks,
    including the General Language Understanding Evaluation (GLUE) benchmark, SQuAD
    question answering dataset and a massive multi-lingual NER dataset with 41 languages.
  example_title: xtremedistil q2
  text: On what is the model validated?
datasets:
- squad_v2
metrics:
- f1
- exact
tags:
- question-answering
model-index:
- name: nbroad/xdistil-l12-h384-squad2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - name: Exact Match
      type: exact_match
      value: 75.4591
      verified: true
    - name: F1
      type: f1
      value: 79.3321
      verified: true
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - name: Exact Match
      type: exact_match
      value: 81.8604
      verified: true
    - name: F1
      type: f1
      value: 89.6654
      verified: true
---

xtremedistil-l12-h384 fine-tuned on SQuAD 2.0.

Evaluation on the SQuAD 2.0 validation set:

"eval_exact": 75.45691906005221  
"eval_f1": 79.32502968532793