Update README.md
README.md
CHANGED
@@ -96,7 +96,7 @@ The OpenCALM-7B model was fine-tuned on the above dataset using the QLoRA method
  | **Optimizer** <br>   beta_1 <br>   beta_2 <br>   weight decay | AdamW <br> 0.9 <br> 0.999 <br> 0.01 |
  | **Learning rate** <br>   scheduler type | 2e-5 <br> linear |
  | **LoRA** <br>   target modules <br>   r <br>   alpha <br>   dropout | <br> query_key_value, dense <br> 4 <br> 64 <br> 0.05 |
- | **QLoRA** <br>   compute dtype <br>   storage dtype <br>   quantization strategy | <br> float16 <br> nf4 <br> double quantization |
+ | **Quantization (for QLoRA)** <br>   compute dtype <br>   storage dtype <br>   quantization strategy | <br> float16 <br> nf4 <br> double quantization |
  | **Sequence length** | 1536 |
  | **Batch size** | 4 |
  | **Gradient accumulation steps** | 2 |
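For reference, the settings in the table could be expressed with the `transformers` / `peft` / `bitsandbytes` APIs roughly as sketched below. This is only an illustrative sketch, not the repository's actual training script; the model id `cyberagent/open-calm-7b`, the `output_dir`, and the overall structure are assumptions layered on top of the hyperparameters listed above.

```python
# Illustrative sketch only: maps the table's hyperparameters onto common
# QLoRA tooling. Not the repository's actual training code.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "cyberagent/open-calm-7b"  # assumed Hugging Face id for OpenCALM-7B

# Quantization (for QLoRA): nf4 storage dtype, float16 compute dtype, double quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA: target modules, r, alpha, dropout as in the table
lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    target_modules=["query_key_value", "dense"],
    r=4,
    lora_alpha=64,
    lora_dropout=0.05,
)
model = get_peft_model(model, lora_config)

# Optimizer (AdamW is the default), learning-rate schedule, and batch settings
training_args = TrainingArguments(
    output_dir="outputs",            # placeholder path
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    weight_decay=0.01,
    adam_beta1=0.9,
    adam_beta2=0.999,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
)
# The sequence length of 1536 would be enforced when tokenizing the dataset
# (e.g. truncating/packing examples to max_length=1536).
```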