haih2 committed on
Commit 0da2af5
1 Parent(s): dc85892

Update README.md

Files changed (1)
  1. README.md +15 -3
README.md CHANGED
@@ -89,7 +89,19 @@ or
 
  ### Fine-tuning procedure
 
- The OpenCALM-7B model was fine-tuned on the above dataset using the QLoRA method with prompt `この段落の要約{length}生成:{source}\n`.
+ The OpenCALM-7B model was fine-tuned on the above dataset using the QLoRA method with the prompt `この段落の要約{length}生成:{source}\n`. We used the following hyperparameters:
+
+ | Hyperparameter | Value |
+ |----------------|----------------:|
+ | **Optimizer** <br> &emsp; beta_1 <br> &emsp; beta_2 <br> &emsp; weight decay | AdamW <br> 0.9 <br> 0.999 <br> 0.01 |
+ | **Learning rate** <br> &emsp; scheduler type | 2e-5 <br> linear |
+ | **LoRA** <br> &emsp; target modules <br> &emsp; r <br> &emsp; alpha <br> &emsp; dropout | <br> query_key_value, dense <br> 4 <br> 64 <br> 0.05 |
+ | **QLoRA** <br> &emsp; compute dtype <br> &emsp; storage dtype <br> &emsp; quantization strategy | <br> float16 <br> nf4 <br> double quantization |
+ | **Sequence length** | 1536 |
+ | **Batch size** | 4 |
+ | **Gradient accumulation steps** | 2 |
+ | **Epochs** | 10 |
+ | **Warmup steps** | 200 |
 
  ## Evaluation
 
@@ -99,8 +111,8 @@ We evaluated the model on two sets: one for *multi-topic* summarization and the
 
  ### Results
 
- |Solution/Model|ROUGE-L <br> (multi-topic)|ROUGE-L <br> (single-topic)|
- |:------------:|:------------------------:|:-------------------------:|
+ | Solution/Model | ROUGE-L <br> (multi-topic) | ROUGE-L <br> (single-topic) |
+ |----------------|:--------------------------:|:---------------------------:|
  |1st place solution* |34.12 |**34.44**|
  |2nd place solution* |32.79 |33.65 |
  |*OpenCALM-7B (QLoRA)*|***36.75***|*33.31* |
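
The commit adds only the hyperparameter table, not the training code. As a rough, non-authoritative sketch, those values could map onto a Hugging Face `transformers`/`peft` QLoRA setup as below; the model id `cyberagent/open-calm-7b`, the `output_dir` name, and the `build_prompt` helper with its `length`/`source` placeholders are assumptions for illustration, not part of this commit.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# QLoRA rows of the table: nf4 storage dtype, double quantization,
# float16 compute dtype.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# Assumed model id for OpenCALM-7B on the Hub.
model_id = "cyberagent/open-calm-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA rows of the table.
lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    target_modules=["query_key_value", "dense"],
    r=4,
    lora_alpha=64,
    lora_dropout=0.05,
)
model = get_peft_model(model, lora_config)

# Optimizer, schedule, and batching rows of the table.
training_args = TrainingArguments(
    output_dir="opencalm-7b-qlora-sum",  # assumed name
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    weight_decay=0.01,
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    warmup_steps=200,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    num_train_epochs=10,
    fp16=True,
)

# Prompt format from the README; `length` and `source` are hypothetical
# placeholders filled in per training example.
def build_prompt(length: str, source: str) -> str:
    return f"この段落の要約{length}生成:{source}\n"

# Inputs would be truncated to the table's sequence length, e.g.:
# tokenizer(build_prompt(length, source), max_length=1536, truncation=True)
```

One design note implied by the table: with r=4 and alpha=64, the LoRA update is scaled by alpha/r = 16, a comparatively strong scaling for such a small rank.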