### Fine-tuning procedure

The OpenCALM-7B model was fine-tuned on the above dataset using the QLoRA method with the prompt `この段落の要約{length}生成:{source}\n`. We used the following hyperparameters:

| Hyperparameter | Value |
|----------------|----------------:|
| **Optimizer** <br> &nbsp;&nbsp; beta_1 <br> &nbsp;&nbsp; beta_2 <br> &nbsp;&nbsp; weight decay | AdamW <br> 0.9 <br> 0.999 <br> 0.01 |
| **Learning rate** <br> &nbsp;&nbsp; scheduler type | 2e-5 <br> linear |
| **LoRA** <br> &nbsp;&nbsp; target modules <br> &nbsp;&nbsp; r <br> &nbsp;&nbsp; alpha <br> &nbsp;&nbsp; dropout | <br> query_key_value, dense <br> 4 <br> 64 <br> 0.05 |
| **QLoRA** <br> &nbsp;&nbsp; compute dtype <br> &nbsp;&nbsp; storage dtype <br> &nbsp;&nbsp; quantization strategy | <br> float16 <br> nf4 <br> double quantization |
| **Sequence length** | 1536 |
| **Batch size** | 4 |
| **Gradient accumulation steps** | 2 |
| **Epochs** | 10 |
| **Warmup steps** | 200 |
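As a rough guide, the settings above can be expressed with the Hugging Face `peft` / `transformers` / `bitsandbytes` APIs. This is a sketch under our own assumptions (the README does not include the training script, and `output_dir` is a hypothetical path), not the authors' code:

```python
# Sketch: mapping the hyperparameter table onto Hugging Face APIs.
# Assumptions: peft / transformers / bitsandbytes, and a hypothetical output_dir.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# Prompt template from this README; {length} and {source} are filled per example.
PROMPT = "この段落の要約{length}生成:{source}\n"

# QLoRA: 4-bit NF4 storage with double quantization, float16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapters on the attention and dense projections of OpenCALM-7B
lora_config = LoraConfig(
    r=4,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["query_key_value", "dense"],
    task_type="CAUSAL_LM",
)

# Optimizer and schedule; effective batch size is 4 × 2 = 8 sequences per step.
# Inputs would be truncated/padded to 1536 tokens at tokenization time.
training_args = TrainingArguments(
    output_dir="out",  # hypothetical path
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    weight_decay=0.01,
    adam_beta1=0.9,
    adam_beta2=0.999,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    num_train_epochs=10,
    warmup_steps=200,
)
```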
## Evaluation

We evaluated the model on two sets: one for *multi-topic* summarization and the other for *single-topic* summarization.

### Results

| Solution/Model | ROUGE-L <br> (multi-topic) | ROUGE-L <br> (single-topic) |
|----------------|:--------------------------:|:---------------------------:|
| 1st place solution*   | 34.12       | **34.44** |
| 2nd place solution*   | 32.79       | 33.65     |
| *OpenCALM-7B (QLoRA)* | ***36.75*** | *33.31*   |
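The ROUGE-L scores above are F-measures over the longest common subsequence (LCS) of candidate and reference summaries. A minimal pure-Python sketch of the metric; the tokenization choice (e.g. character- vs. word-level for Japanese) and β are our assumptions, and the competition's exact scorer may differ:

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-measure over pre-tokenized sequences (beta=1 weighs P and R equally)."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)

# Character-level example (one plausible tokenization for Japanese text):
score = rouge_l(list("要約を生成"), list("要約の生成"))
```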