GLUE-leaderboard / leaderboard.tsv
Rank Name Model URL Score CoLA SST-2 MRPC STS-B QQP MNLI-m MNLI-mm QNLI RTE WNLI AX
1 Microsoft Alexander v-team Turing ULR v6 91.3 73.3 97.5 94.2/92.3 93.5/93.1 76.4/90.9 92.5 92.1 96.7 93.6 97.9 55.4
2 JDExplore d-team Vega v1 91.3 73.8 97.9 94.5/92.6 93.5/93.1 76.7/91.1 92.1 91.9 96.7 92.4 97.9 51.4
3 Microsoft Alexander v-team Turing NLR v5 91.2 72.6 97.6 93.8/91.7 93.7/93.3 76.4/91.1 92.6 92.4 97.9 94.1 95.9 57.0
4 DIRL Team DeBERTa + CLEVER 91.1 74.7 97.6 93.3/91.1 93.4/93.1 76.5/91.0 92.1 91.8 96.7 93.2 96.6 53.3
5 ERNIE Team - Baidu ERNIE 91.1 75.5 97.8 93.9/91.8 93.0/92.6 75.2/90.9 92.3 91.7 97.3 92.6 95.9 51.7
6 AliceMind & DIRL StructBERT + CLEVER 91.0 75.3 97.7 93.9/91.9 93.5/93.1 75.6/90.8 91.7 91.5 97.4 92.5 95.2 49.1
7 DeBERTa Team - Microsoft DeBERTa / TuringNLRv4 90.8 71.5 97.5 94.0/92.0 92.9/92.6 76.2/90.8 91.9 91.6 99.2 93.2 94.5 53.2
8 HFL iFLYTEK MacALBERT + DKM 90.7 74.8 97.0 94.5/92.6 92.8/92.6 74.7/90.6 91.3 91.1 97.8 92.0 94.5 52.6
9 PING-AN Omni-Sinitic ALBERT + DAAF + NAS 90.6 73.5 97.2 94.0/92.0 93.0/92.4 76.1/91.0 91.6 91.3 97.5 91.7 94.5 51.2
10 T5 Team - Google T5 90.3 71.6 97.5 92.8/90.4 93.1/92.8 75.1/90.6 92.2 91.9 96.9 92.8 94.5 53.1
11 Microsoft D365 AI & MSR AI & GATECH MT-DNN-SMART 89.9 69.5 97.5 93.7/91.6 92.9/92.5 73.9/90.2 91.0 90.8 99.2 89.7 94.5 50.2
12 Huawei Noah's Ark Lab NEZHA-Large 89.8 71.7 97.3 93.3/91.0 92.4/91.9 75.2/90.7 91.5 91.3 96.2 90.3 94.5 47.9
13 LG AI Research ANNA 89.8 68.7 97.0 92.7/90.1 93.0/92.8 75.3/90.5 91.8 91.6 96.0 91.8 95.9 51.8
14 Zihang Dai Funnel-Transformer (Ensemble B10-10-10H1024) 89.7 70.5 97.5 93.4/91.2 92.6/92.3 75.4/90.7 91.4 91.1 95.8 90.0 94.5 51.6
15 ELECTRA Team ELECTRA-Large + Standard Tricks 89.4 71.7 97.1 93.1/90.7 92.9/92.5 75.6/90.8 91.3 90.8 95.8 89.8 91.8 50.7
16 David Kim 2digit LANet 89.3 71.8 97.3 92.4/89.6 93.0/92.7 75.5/90.5 91.8 91.6 96.4 91.1 88.4 54.6
17 倪仕文 DropAttack-RoBERTa-large 88.8 70.3 96.7 92.6/90.1 92.1/91.8 75.1/90.5 91.1 90.9 95.3 89.9 89.7 48.2
18 Microsoft D365 AI & UMD FreeLB-RoBERTa (ensemble) 88.4 68.0 96.8 93.1/90.8 92.3/92.1 74.8/90.3 91.1 90.7 95.6 88.7 89.0 50.1
19 Junjie Yang HIRE-RoBERTa 88.3 68.6 97.1 93.0/90.7 92.4/92.0 74.3/90.2 90.7 90.4 95.5 87.9 89.0 49.3
20 Shiwen Ni ELECTRA-large-M (bert4keras) 88.3 69.3 95.8 92.2/89.6 91.2/91.1 75.1/90.5 91.1 90.9 93.8 87.9 91.8 48.2
21 Facebook AI RoBERTa 88.1 67.8 96.7 92.3/89.8 92.2/91.9 74.3/90.2 90.8 90.2 95.4 88.2 89.0 48.7
22 Microsoft D365 AI & MSR AI MT-DNN-ensemble 87.6 68.4 96.5 92.7/90.3 91.1/90.7 73.7/89.9 87.9 87.4 96.0 86.3 89.0 42.8
23 GLUE Human Baselines GLUE Human Baselines 87.1 66.4 97.8 86.3/80.8 92.7/92.6 59.5/80.4 92.0 92.8 91.2 93.6 95.9 -
24 kk xx ELECTRA-Large-NewSCL(single) 85.6 73.3 97.2 92.7/90.2 92.0/91.7 75.3/90.6 90.8 90.3 95.6 86.9 60.3 50.0
25 Adrian de Wynter Bort (Alexa AI) 83.6 63.9 96.2 94.1/92.3 89.2/88.3 66.0/85.9 88.1 87.8 92.3 82.7 71.2 51.9
26 Lab LV ConvBERT base 83.2 67.8 95.7 91.4/88.3 90.4/89.7 73.0/90.0 88.3 87.4 93.2 77.9 65.1 42.9
27 Stanford Hazy Research Snorkel MeTaL 83.2 63.8 96.2 91.5/88.5 90.1/89.7 73.1/89.9 87.6 87.2 93.9 80.9 65.1 39.9
28 XLM Systems XLM (English only) 83.1 62.9 95.6 90.7/87.1 88.8/88.2 73.2/89.8 89.1 88.5 94.0 76.0 71.9 44.7
29 WATCH ME ConvBERT-base-paddle-v1.1 83.1 66.3 95.4 91.6/88.6 90.0/89.2 73.9/90.0 88.2 87.7 93.3 78.2 65.1 9.2
30 Zhuosheng Zhang SemBERT 82.9 62.3 94.6 91.2/88.3 87.8/86.7 72.8/89.8 87.6 86.3 94.6 84.5 65.1 42.4
31 Jun Yu mpnet-base-paddle 82.9 60.5 95.9 91.6/88.9 90.8/90.3 72.5/89.7 87.6 86.6 93.3 82.4 65.1 9.2
32 Danqi Chen SpanBERT (single-task training) 82.8 64.3 94.8 90.9/87.9 89.9/89.1 71.9/89.5 88.1 87.7 94.3 79.0 65.1 45.1
33 GAL team distilRoBERTa+GAL (6-layer transformer single model) 82.6 60.0 95.3 91.9/89.2 90.0/89.6 73.3/90.0 87.4 86.5 92.7 81.8 65.1 0.0
34 Kevin Clark BERT + BAM 82.3 61.5 95.2 91.3/88.3 88.6/87.9 72.5/89.7 86.6 85.8 93.1 80.4 65.1 40.7
35 Nitish Shirish Keskar Span-Extractive BERT on STILTs 82.3 63.2 94.5 90.6/87.6 89.4/89.2 72.2/89.4 86.5 85.8 92.5 79.8 65.1 28.3
36 LV NUS LV-BERT-base 82.1 64.0 94.7 90.9/87.9 89.4/88.8 72.3/89.5 86.6 86.1 92.6 77.0 65.1 39.5
37 Jason Phang BERT on STILTs 82.0 62.1 94.3 90.2/86.6 88.7/88.3 71.9/89.4 86.4 85.6 92.7 80.1 65.1 28.3
38 gao jie 1 82.0 66.8 96.5 90.9/87.2 91.4/90.8 72.9/89.6 90.2 56.4 94.7 82.8 62.3 9.2
39 Gino Tesei RobustRoBERTa 81.9 63.6 96.8 91.6/88.6 90.3/89.6 73.2/89.7 90.0 89.4 95.1 50.3 80.1 50.5
40 Karen Hambardzumyan WARP with RoBERTa 81.6 53.9 96.3 88.2/83.9 89.5/88.8 68.6/87.7 88.0 88.2 93.5 84.3 65.1 41.2
41 Junxiong Wang Bigs-128-1000k 81.5 64.4 94.9 88.7/84.2 87.8/87.5 71.2/89.2 86.1 85.0 91.6 77.6 65.1 36.2
42 Huawei Noah's Ark Lab MTL CombinedKD-TinyRoBERTa (6 layer 82M parameters, MATE-KD + AnnealingKD) 81.5 58.6 95.1 91.2/88.1 88.5/88.4 73.0/89.7 86.2 85.6 92.4 76.6 65.1 20.2
43 Richard Bai segaBERT-large 81.4 62.6 94.8 89.7/86.1 88.6/87.7 72.5/89.4 87.9 87.7 94.0 71.6 65.1 0.0
44 廖亿 u-PMLM-R (Huawei Noah's Ark Lab) 81.3 56.9 94.2 90.7/87.7 89.7/89.1 72.2/89.4 86.1 85.4 92.1 78.5 65.1 40.0
45 Xinsong Zhang AMBERT-BASE 81.0 60.0 95.2 90.6/87.1 86.3/88.2 72.2/89.5 87.2 86.5 92.6 72.6 65.1 39.4
46 Mikita Sazanovich Routed BERTs 80.7 56.1 93.6 88.6/84.7 88.0/87.6 71.0/88.8 85.2 84.5 92.6 80.0 65.1 9.2
47 USCD-AI4Health Team CERT 80.7 58.9 94.6 89.8/85.9 87.9/86.8 72.5/90.3 87.2 86.4 93.0 71.2 65.1 39.6
48 Jacob Devlin BERT: 24-layers, 16-heads, 1024-hidden 80.5 60.5 94.9 89.3/85.4 87.6/86.5 72.1/89.3 86.7 85.9 92.7 70.1 65.1 39.6
49 Chen Qian KerasNLP XLM-R 80.4 56.3 96.1 89.8/86.3 88.4/87.7 72.3/89.0 87.7 87.1 92.8 69.2 65.1 40.6
50 Chen Qian KerasNLP RoBERTa 80.4 56.3 96.1 89.8/86.3 88.4/87.7 72.3/89.0 87.7 87.1 92.8 69.2 65.1 40.6
51 Jinliang LU MULTIPLE_ADAPTER_T5_BASE 80.3 54.1 93.8 90.1/86.8 87.9/87.6 71.8/88.9 86.1 85.7 93.5 76.8 62.3 9.2
52 Yoshitomo Matsubara HF bert-large-uncased (default fine-tuning) 80.2 61.5 94.6 89.2/85.2 86.4/85.0 72.2/89.3 86.4 85.7 92.4 68.9 65.1 36.9
53 Neil Houlsby BERT + Single-task Adapters 80.2 59.2 94.3 88.7/84.3 87.3/86.1 71.5/89.4 85.4 85.0 92.4 71.6 65.1 9.2
54 KI BERT KI-BERT 80.0 55.6 94.5 88.2/83.9 86.3/85.1 71.5/88.9 85.2 83.7 91.2 69.3 73.3 35.6
55 Xiangyang Liu elasticbert-large-12L 79.9 57.0 92.9 89.4/86.0 89.7/88.6 72.7/89.6 85.4 84.9 92.3 71.8 62.3 9.2
56 刘向阳 roberta-large-12L 79.8 59.4 94.6 89.1/85.8 89.8/89.1 71.5/89.4 86.4 85.2 91.6 67.3 62.3 9.2
57 Zhuohan Li Macaron Net-base 79.7 57.6 94.0 88.4/84.4 87.5/86.3 70.8/89.0 85.4 84.5 91.6 70.5 65.1 38.7
58 shi To GAT-bert-base 79.6 56.8 94.0 89.4/85.3 87.9/86.8 72.4/89.4 85.7 84.5 91.8 70.5 62.3 9.2
59 teerapong saelim WT-VAT-BERT (Base) 79.5 56.0 94.4 89.2/85.5 87.3/86.2 72.9/89.8 85.5 84.8 91.4 70.4 62.3 9.2
60 李思辉 FD-CMIC+CM-MIC best MNLI-mm 79.3 51.1 93.1 89.7/86.1 87.5/86.6 71.6/89.3 84.2 84.1 91.2 73.6 65.1 0.0
61 eye Gavin zzz 79.1 51.4 92.4 89.7/86.1 87.9/87.0 71.1/89.2 83.6 83.0 91.1 73.6 65.1 -4.5
62 Anshuman Singh Bert-n-Pals 79.1 52.2 93.4 89.5/85.6 86.6/85.9 71.4/89.0 84.1 83.5 90.6 75.4 62.3 33.8
63 ANSHUMAN SINGH (RA1811003010460) DeepPavlov Multitask PalBert 78.8 48.1 93.4 88.9/85.6 87.0/86.7 71.4/89.0 83.9 83.4 90.8 76.7 62.3 33.8
64 xiaok Liu BERT-EMD(6-layer; Single model; No DA) 78.7 47.5 93.3 89.8/86.4 87.6/86.8 72.0/89.3 84.7 83.5 90.7 71.7 65.1 9.2
65 蘇大鈞 SesameBERT-Base 78.6 52.7 94.2 88.9/84.8 86.5/85.5 70.8/88.8 83.7 83.6 91.0 67.6 65.1 35.8
66 xinge ma ReptileDistil 78.5 47.9 92.8 89.2/85.4 87.1/85.9 71.0/89.0 83.6 82.9 90.4 73.5 65.1 33.2
67 MobileBERT Team MobileBERT 78.5 51.1 92.6 88.8/84.5 86.2/84.8 70.5/88.3 84.3 83.4 91.6 70.4 65.1 34.3
68 Linyuan Gong StackingBERT-Base 78.4 56.2 93.9 88.2/83.9 84.2/82.5 70.4/88.7 84.4 84.2 90.1 67.0 65.1 36.6
69 TinyBERT Team TinyBERT (6-layer; Single model) 78.1 51.1 93.1 87.3/82.6 85.0/83.7 71.6/89.1 84.6 83.2 90.4 70.0 65.1 9.2
70 SqueezeBERT Team SqueezeBERT (4.3x faster than BERT-base on smartphone) 78.1 46.5 91.4 89.5/86.0 87.0/86.3 71.5/89.0 82.0 81.1 90.1 73.2 65.1 35.3
71 Anshuman Singh CAMTL 77.9 53.0 92.6 88.3/84.4 86.6/85.9 70.0/88.5 82.3 82.0 90.5 72.8 58.2 33.8
72 傅薛林 KRISFU 77.8 52.4 92.5 89.0/84.8 83.7/82.2 70.4/88.6 84.3 83.4 90.9 65.9 65.1 36.1
73 王上 s0 77.8 46.8 92.9 88.9/84.8 87.2/86.5 71.9/89.1 84.5 83.4 90.8 70.9 60.3 35.3
74 Stark Tony Pocket GLUE 77.6 49.3 92.4 89.0/84.6 84.9/84.0 70.1/88.7 84.0 82.8 90.1 67.2 65.1 36.1
75 Pavan Kalyan Reddy Neerudu Pavan Neerudu - BERT 77.6 56.1 93.5 87.6/83.2 85.3/83.8 70.6/88.8 84.0 83.4 90.8 64.0 60.3 34.6
76 NLC MSR Asia BERT-of-Theseus (6-layer; single model) 77.1 47.8 92.2 87.6/83.2 85.6/84.1 71.6/89.3 82.4 82.1 89.6 66.2 65.1 9.2
77 WANG Jiachuan MEmeL_test_bertbase 77.0 52.1 93.9 88.7/84.5 85.9/84.5 71.5/89.3 84.9 83.7 90.5 68.1 52.1 35.6
78 Bruce Lee YKW 76.2 35.3 91.8 87.6/83.2 85.7/84.5 69.9/87.7 84.7 83.9 89.9 72.7 62.3 36.3
79 Hanxiong Huang Hanxiong Huang 75.9 49.3 93.3 87.1/81.9 83.3/81.7 71.5/89.1 84.8 83.8 91.0 64.1 53.4 9.2
80 YeonTaek Oh EL-BERT(6-Layer, Single model) 75.6 47.7 91.0 87.8/83.0 81.2/80.2 69.9/88.1 81.8 81.0 90.2 59.9 65.1 31.8
81 EVS Team Anonymous 74.7 52.6 93.4 87.6/83.2 61.2/59.1 71.8/89.3 83.7 83.2 89.9 65.0 62.3 35.6
82 Chen Money KerasNLP 12/05/2022 Trial 2 74.6 52.2 93.5 87.8/82.6 84.5/83.1 71.3/89.3 82.3 81.6 89.3 61.7 43.8 32.9
83 Sinx ZHIYUAN 74.1 57.0 95.2 91.4/88.4 91.1/90.8 24.2/23.7 87.7 87.3 92.5 81.7 47.9 0.3
84 Tirana Noor Fatyanosa distilbert-base-uncased 73.6 45.8 92.3 87.6/83.1 71.0/71.0 69.6/88.2 81.6 81.3 88.8 54.1 65.1 31.8
85 Haiqin YANG RefBERT 73.1 47.9 92.9 86.9/81.9 75.0/76.3 61.6/84.4 80.9 80.3 87.3 61.7 54.8 -10.3
86 Haiqin Yang RefBERT 73.1 47.9 92.9 86.9/81.9 75.0/76.3 61.4/84.2 80.9 80.3 87.3 61.7 54.8 -10.3
87 Haiqin Yang RefBERT 71.8 36.3 92.9 86.9/81.9 75.0/76.3 61.6/83.8 80.9 80.3 87.3 61.7 54.8 -10.3
88 Haiqin Yang RefBERT 71.8 36.3 92.9 86.9/81.9 75.0/76.3 61.3/83.6 80.9 80.3 87.3 61.7 54.8 -10.3
89 公能公能 1111 71.4 35.8 90.1 83.2/75.7 81.0/79.3 68.5/87.5 77.5 77.1 86.7 58.0 56.8 9.2
90 Jack Hessel Bag-of-words only BoW-BERT (Base) 70.0 14.3 86.7 82.9/75.2 81.8/80.3 68.3/87.5 79.8 79.7 86.2 60.4 65.1 31.0
91 GLUE Baselines BiLSTM+ELMo+Attn 70.0 33.6 90.4 84.4/78.0 74.2/72.3 63.1/84.3 74.1 74.5 79.8 58.9 65.1 21.7
BiLSTM+ELMo 67.7 32.1 89.3 84.7/78.0 70.3/67.8 61.1/82.6 67.2 67.9 75.5 57.4 65.1 21.3
Single Task BiLSTM+ELMo+Attn 66.5 35.0 90.2 80.2/68.8 55.5/52.5 66.1/86.5 76.9 76.7 76.7 50.3 65.1 27.9
Single Task BiLSTM+ELMo 66.4 35.0 90.2 80.8/69.0 64.0/60.2 65.6/85.7 72.9 73.4 71.7 50.1 65.1 19.5
GenSen 66.1 7.7 83.1 83.0/76.6 79.3/79.2 59.8/82.9 71.4 71.3 78.6 59.2 65.1 20.6
BiLSTM+Attn 65.6 18.6 83.0 83.9/76.2 72.8/70.5 60.1/82.4 67.6 68.3 74.3 58.4 65.1 17.8
BiLSTM 64.2 11.6 82.8 81.8/74.3 70.3/67.8 62.5/84.2 65.6 66.1 74.6 57.4 65.1 20.3
InferSent 63.9 4.5 85.1 81.2/74.1 75.9/75.3 59.1/81.7 66.1 65.7 72.7 58.0 65.1 18.3
Single Task BiLSTM 63.7 15.7 85.9 79.4/69.3 66.0/62.8 61.4/81.7 70.3 70.8 75.7 52.8 62.3 21.0
Single Task BiLSTM+CoVe 63.6 14.5 88.5 81.4/73.4 67.2/64.1 59.4/83.3 64.5 64.8 75.4 53.5 61.6 20.6
BiLSTM+CoVe+Attn 63.1 8.3 80.7 80.0/71.8 69.8/68.4 60.5/83.4 68.1 68.6 72.9 56.0 65.1 18.3
Single Task BiLSTM+CoVe+Attn 63.1 14.5 88.5 79.7/68.6 57.2/53.6 60.1/84.1 71.6 71.5 74.5 52.7 64.4 23.8
BiLSTM+CoVe 62.9 18.5 81.9 78.7/71.5 64.4/62.7 60.6/84.9 65.4 65.7 70.8 52.7 65.1 17.6
Single Task BiLSTM+Attn 62.8 15.7 85.9 80.3/68.5 59.3/55.8 62.9/83.5 74.2 73.8 77.2 51.9 55.5 24.9
DisSent 61.9 4.9 83.7 81.7/74.1 66.1/64.8 59.5/82.6 58.7 59.1 73.9 56.4 65.1 15.9
Skip-Thought 61.3 0.0 81.8 80.8/71.7 71.8/69.7 56.4/82.2 62.9 62.8 72.9 53.1 65.1 12.2
CBOW 58.6 0.0 80.0 81.5/73.4 61.2/58.7 51.4/79.1 56.0 56.4 72.1 54.1 62.3 9.2
92 XLNet Team XLNet (ensemble) - 70.2 97.1 92.9/90.5 93.0/92.6 74.7/90.4 90.9 90.9 - 88.5 92.5 48.4
93 ALBERT-Team Google Language ALBERT (Ensemble) - 69.1 97.1 93.4/91.2 92.5/92.0 74.2/90.5 91.3 91.0 - 89.2 91.8 50.2
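
The file does not say how the Score column is derived. As a minimal sketch, assuming Score is the unweighted mean over nine tasks (paired cells such as `94.2/92.3` averaged first, MNLI-m and MNLI-mm combined into one task, AX excluded), the following Python recomputes it from a row's task cells. This assumption reproduces the listed Score for the rank 1 and rank 10 rows, but it may not hold for every row. The names `TASKS`, `cell_value`, and `glue_score` are hypothetical helpers, not part of any library.

```python
from statistics import mean

# Task columns in the order they follow Score in the header row.
# AX is deliberately left out of the average (assumption, see above).
TASKS = ["CoLA", "SST-2", "MRPC", "STS-B", "QQP",
         "MNLI-m", "MNLI-mm", "QNLI", "RTE", "WNLI"]


def cell_value(cell: str) -> float:
    """Average a paired cell such as '94.2/92.3'; single values pass through."""
    return mean(float(part) for part in cell.split("/"))


def glue_score(cells: dict[str, str]) -> float:
    """Recompute the overall score from per-task cells (hypothetical helper)."""
    values = {task: cell_value(cells[task]) for task in TASKS}
    # Matched and mismatched MNLI are treated as a single task.
    mnli = mean([values.pop("MNLI-m"), values.pop("MNLI-mm")])
    return round(mean([*values.values(), mnli]), 1)


# Task cells of the rank 1 row (Turing ULR v6), copied from the table.
row = "73.3 97.5 94.2/92.3 93.5/93.1 76.4/90.9 92.5 92.1 96.7 93.6 97.9".split()
score = glue_score(dict(zip(TASKS, row)))
print(score)  # the table lists 91.3 for this row
```

Rows with a `-` in a task column or in Score (the human baseline's AX, ranks 92 and 93) would need separate handling; this sketch only covers rows where all ten task cells are numeric.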