subho committed on
Commit
fbedb81
1 Parent(s): 79d89b0

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -12,6 +12,8 @@ XtremeDistil is a distilled task-agnostic transformer model leveraging multi-tas
 
 This l6-h384 checkpoint with **6** layers, **384** hidden size, **12** attention heads corresponds to **22 million** parameters with **5.3x** speedup over BERT-base.
 
+Other available checkpoints: [xtremedistil-l6-h256-uncased](https://huggingface.co/microsoft/xtremedistil-l6-h256-uncased) and [xtremedistil-l6-h384-uncased](https://huggingface.co/microsoft/xtremedistil-l6-h384-uncased)
+
 The following table shows the results on GLUE dev set and SQuAD-v2.
 
 | Models | #Params | Speedup | MNLI | QNLI | QQP | RTE | SST | MRPC | SQUAD2 | Avg |