### Fine-tuning procedure

The OpenCALM-7B model was fine-tuned on the above dataset using the QLoRA method with the prompt `この段落の要約{length}生成:{source}\n`. We used the following hyperparameters:

| Hyperparameter | Value |
|----------------|----------------:|
| **Optimizer** <br> &nbsp;&nbsp; beta_1 <br> &nbsp;&nbsp; beta_2 <br> &nbsp;&nbsp; weight decay | AdamW <br> 0.9 <br> 0.999 <br> 0.01 |
| **Learning rate** <br> &nbsp;&nbsp; scheduler type | 2e-5 <br> linear |
| **LoRA** <br> &nbsp;&nbsp; target modules <br> &nbsp;&nbsp; r <br> &nbsp;&nbsp; alpha <br> &nbsp;&nbsp; dropout | <br> query_key_value, dense <br> 4 <br> 64 <br> 0.05 |
| **QLoRA** <br> &nbsp;&nbsp; compute dtype <br> &nbsp;&nbsp; storage dtype <br> &nbsp;&nbsp; quantization strategy | <br> float16 <br> nf4 <br> double quantization |
| **Sequence length** | 1536 |
| **Batch size** | 4 |
| **Gradient accumulation steps** | 2 |
| **Epochs** | 10 |
| **Warmup steps** | 200 |
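As a rough guide, the settings above can be expressed with the Hugging Face `peft` / `transformers` / `bitsandbytes` APIs. This is a sketch under our own assumptions (the README does not include the training script, and `output_dir` is a hypothetical path), not the authors' code:

```python
# Sketch: mapping the hyperparameter table onto Hugging Face APIs.
# Assumptions: peft / transformers / bitsandbytes, and a hypothetical output_dir.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# Prompt template from this README; {length} and {source} are filled per example.
PROMPT = "この段落の要約{length}生成:{source}\n"

# QLoRA: 4-bit NF4 storage with double quantization, float16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapters on the attention and dense projections of OpenCALM-7B
lora_config = LoraConfig(
    r=4,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["query_key_value", "dense"],
    task_type="CAUSAL_LM",
)

# Optimizer and schedule; effective batch size is 4 × 2 = 8 sequences per step.
# Inputs would be truncated/padded to 1536 tokens at tokenization time.
training_args = TrainingArguments(
    output_dir="out",  # hypothetical path
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    weight_decay=0.01,
    adam_beta1=0.9,
    adam_beta2=0.999,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    num_train_epochs=10,
    warmup_steps=200,
)
```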
## Evaluation

We evaluated the model on two sets: one for *multi-topic* summarization and the other for *single-topic* summarization.

### Results

| Solution/Model | ROUGE-L <br> (multi-topic) | ROUGE-L <br> (single-topic) |
|----------------|:--------------------------:|:---------------------------:|
| 1st place solution*   | 34.12       | **34.44** |
| 2nd place solution*   | 32.79       | 33.65     |
| *OpenCALM-7B (QLoRA)* | ***36.75*** | *33.31*   |
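The ROUGE-L scores above are F-measures over the longest common subsequence (LCS) of candidate and reference summaries. A minimal pure-Python sketch of the metric; the tokenization choice (e.g. character- vs. word-level for Japanese) and β are our assumptions, and the competition's exact scorer may differ:

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-measure over pre-tokenized sequences (beta=1 weighs P and R equally)."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)

# Character-level example (one plausible tokenization for Japanese text):
score = rouge_l(list("要約を生成"), list("要約の生成"))
```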