---
widget:
- context: While deep and large pre-trained models are the state-of-the-art for various
    natural language processing tasks, their huge size poses significant challenges
    for practical uses in resource constrained settings. Recent works in knowledge
    distillation propose task-agnostic as well as task-specific methods to compress
    these models, with task-specific ones often yielding higher compression rate.
    In this work, we develop a new task-agnostic distillation framework XtremeDistilTransformers
    that leverages the advantage of task-specific methods for learning a small universal
    model that can be applied to arbitrary tasks and languages. To this end, we study
    the transferability of several source tasks, augmentation resources and model
    architecture for distillation. We evaluate our model performance on multiple tasks,
    including the General Language Understanding Evaluation (GLUE) benchmark, SQuAD
    question answering dataset and a massive multi-lingual NER dataset with 41 languages.
  example_title: xtremedistil q1
  text: What is XtremeDistil?
- context: While deep and large pre-trained models are the state-of-the-art for various
    natural language processing tasks, their huge size poses significant challenges
    for practical uses in resource constrained settings. Recent works in knowledge
    distillation propose task-agnostic as well as task-specific methods to compress
    these models, with task-specific ones often yielding higher compression rate.
    In this work, we develop a new task-agnostic distillation framework XtremeDistilTransformers
    that leverages the advantage of task-specific methods for learning a small universal
    model that can be applied to arbitrary tasks and languages. To this end, we study
    the transferability of several source tasks, augmentation resources and model
    architecture for distillation. We evaluate our model performance on multiple tasks,
    including the General Language Understanding Evaluation (GLUE) benchmark, SQuAD
    question answering dataset and a massive multi-lingual NER dataset with 41 languages.
  example_title: xtremedistil q2
  text: On what is the model validated?
datasets:
- squad_v2
metrics:
- f1
- exact
tags:
- question-answering
model-index:
- name: nbroad/xdistil-l12-h384-squad2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - name: Exact Match
      type: exact_match
      value: 75.4591
      verified: true
    - name: F1
      type: f1
      value: 79.3321
      verified: true
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - name: Exact Match
      type: exact_match
      value: 81.8604
      verified: true
    - name: F1
      type: f1
      value: 89.6654
      verified: true
---

xtremedistil-l12-h384 trained on SQuAD 2.0
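
Below is a minimal usage sketch with the Hugging Face `transformers` question-answering pipeline. It assumes the repository id listed in this card's metadata (`nbroad/xdistil-l12-h384-squad2`); the question and context are taken from the widget example above.

```python
# Minimal sketch: load the checkpoint with the question-answering pipeline
# and ask one of the widget questions against a snippet of the abstract.
from transformers import pipeline

qa = pipeline("question-answering", model="nbroad/xdistil-l12-h384-squad2")

context = (
    "In this work, we develop a new task-agnostic distillation framework "
    "XtremeDistilTransformers that leverages the advantage of task-specific "
    "methods for learning a small universal model that can be applied to "
    "arbitrary tasks and languages."
)

result = qa(question="What is XtremeDistil?", context=context)
print(result["answer"], result["score"])
```

Since the model is trained on SQuAD 2.0, it can also abstain when the context contains no answer; pass `handle_impossible_answer=True` to the pipeline call to allow empty predictions.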

Evaluation results on the SQuAD 2.0 validation set:

"eval_exact": 75.45691906005221
"eval_f1": 79.32502968532793