lgcharpe committed on
Commit
dbe8f4e
1 Parent(s): c56dba2

Create README

Files changed (1)
  1. README +23 -0
README ADDED
@@ -0,0 +1,23 @@
+ Hyperparameters for GLUE:
+ - Learning rate: 5e-5
+ - Batch size: 64
+ - Max epochs: 10
+ - Patience: 10 (for CoLA, MRPC, RTE, BoolQ, MultiRC, and WSC), 100 (for MNLI, QQP, QNLI, and SST-2)
+ - Random seed: 12
+ - Weight decay: 0.1
+ - Warmup ratio: 0.1
+ - Learning rate scheduler: cosine
+ - Eval strategy: epoch (for CoLA, MRPC, RTE, BoolQ, MultiRC, and WSC), steps (for MNLI, QQP, QNLI, and SST-2)
+ - Eval every: 1 (for CoLA, MRPC, RTE, BoolQ, MultiRC, and WSC), 200 (for SST-2 and QNLI), 500 (for MNLI and QQP)
+
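The per-task GLUE values above can be collected into a single lookup. The following is a minimal sketch in plain Python, not the authors' training script: the names `GLUE_SHARED`, `EPOCH_TASKS`, `STEP_TASKS`, and `glue_config` are hypothetical, and only the numbers come from the list above.

```python
# Hypothetical helper: names are illustrative, values are copied from the list above.
GLUE_SHARED = {
    "learning_rate": 5e-5,
    "batch_size": 64,
    "max_epochs": 10,
    "seed": 12,
    "weight_decay": 0.1,
    "warmup_ratio": 0.1,
    "lr_scheduler": "cosine",
}

# Tasks evaluated once per epoch, with patience 10.
EPOCH_TASKS = {"CoLA", "MRPC", "RTE", "BoolQ", "MultiRC", "WSC"}

# Tasks evaluated every N steps, with patience 100 (task -> N).
STEP_TASKS = {"SST-2": 200, "QNLI": 200, "MNLI": 500, "QQP": 500}


def glue_config(task: str) -> dict:
    """Return the shared settings merged with the task-specific ones."""
    config = dict(GLUE_SHARED)
    if task in EPOCH_TASKS:
        config.update(patience=10, eval_strategy="epoch", eval_every=1)
    elif task in STEP_TASKS:
        config.update(
            patience=100, eval_strategy="steps", eval_every=STEP_TASKS[task]
        )
    else:
        raise KeyError(f"no hyperparameters listed for task {task!r}")
    return config
```

Calling `glue_config("MNLI")` returns the shared settings plus `eval_strategy="steps"` and `eval_every=500`; an unlisted task name raises `KeyError` rather than silently falling back to a default.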
+ Hyperparameters for MSGS:
+ - Learning rate: 5e-5 (for CR, SC, RP, MV_RTP, and SC_LC), 1.5e-5 (for LC), 1e-5 (for SC_RP), 8e-6 (for MV_LC), 5e-6 (for MV), 5e-7 (for CR_LC)
+ - Batch size: 32
+ - Max epochs: 10 (for CR, SC, RP, MV_RTP, SC_LC, SC_RP, MV, and CR_LC), 3 (for LC), 5 (for MV_LC)
+ - Patience: 10 (for CR, SC, RP, MV_RTP, SC_LC, SC_RP, MV, and CR_LC), 3 (for LC), 5 (for MV_LC)
+ - Random seed: 12
+ - Weight decay: 0.1
+ - Warmup ratio: 0.1
+ - Learning rate scheduler: cosine
+ - Eval strategy: epoch
+ - Eval every: 1
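
The lists give a warmup ratio and a cosine scheduler but do not spell out the exact schedule. As a rough sketch, assuming the common shape of linear warmup to the peak rate followed by cosine decay to zero (an assumption, not something the README confirms), the learning rate at a given step could be computed as below. `MSGS_LR`, `lr_at_step`, and `total_steps` are hypothetical names; the peak rates are the MSGS values listed above.

```python
import math

# Peak learning rates per MSGS task, copied from the list above.
MSGS_LR = {
    "CR": 5e-5, "SC": 5e-5, "RP": 5e-5, "MV_RTP": 5e-5, "SC_LC": 5e-5,
    "LC": 1.5e-5, "SC_RP": 1e-5, "MV_LC": 8e-6, "MV": 5e-6, "CR_LC": 5e-7,
}


def lr_at_step(step: int, total_steps: int, peak_lr: float,
               warmup_ratio: float = 0.1) -> float:
    """Assumed schedule: linear warmup, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at_step(step, total_steps, MSGS_LR["LC"])` traces the schedule for the LC task. With a warmup ratio of 0.1, the first tenth of the steps ramps up to the peak rate and the remainder decays along a half cosine.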