MichaelFFan/dpo_model

MichaelFFan committed 16 days ago

Commit 90276f3 • 1 Parent(s): f94939a

Update README.md

Files changed (1)

README.md +11 -4

README.md CHANGED

@@ -19,11 +19,18 @@ This model is a fine-tuned version of `meta-llama/Llama-3.2-1B`, adapted using D
 - **License:** MIT
 - **Finetuned from model:** meta-llama/Llama-3.2-1B
-### Model Sources
-- **Repository:** [Link to the Hugging Face model repo]
-- **Paper [optional]:** [If applicable, link to related research or documentation]
-- **Demo [optional]:** [If a demo exists, provide a link]
+## Training Hyperparameters
+- **Training regime:** Mixed precision (fp16)
+- **Learning rate:** 2e-4
+- **Batch size:** 8
+- **Number of epochs:** 3
+- **Optimizer:** AdamW with 8-bit precision
+- **Max sequence length:** 512 tokens
+- **Warmup steps:** 100
+- **Weight decay:** 0.01
 ## Uses
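
The hyperparameters added in this commit can be collected into a small configuration object for reuse. The following is a minimal sketch, not the author's training script: the class name `TrainingHyperparameters`, the helpers `warmup_lr` and `steps_per_epoch`, and the `"adamw_8bit"` label are hypothetical, and the linear warmup shape is an assumption, since the README states only the number of warmup steps.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingHyperparameters:
    """Values copied from the README diff; field names are illustrative."""
    learning_rate: float = 2e-4
    batch_size: int = 8
    num_epochs: int = 3
    max_seq_length: int = 512      # tokens
    warmup_steps: int = 100
    weight_decay: float = 0.01
    fp16: bool = True              # mixed-precision training regime
    optimizer: str = "adamw_8bit"  # hypothetical label for "AdamW with 8-bit precision"


def warmup_lr(step: int, hp: TrainingHyperparameters) -> float:
    """Learning rate at `step`, assuming a linear ramp over the warmup steps."""
    if hp.warmup_steps <= 0 or step >= hp.warmup_steps:
        return hp.learning_rate
    return hp.learning_rate * step / hp.warmup_steps


def steps_per_epoch(num_examples: int, hp: TrainingHyperparameters) -> int:
    """Optimizer steps per epoch, assuming no gradient accumulation."""
    return math.ceil(num_examples / hp.batch_size)


hp = TrainingHyperparameters()
```

With these values, the warmup covers the first 100 optimizer steps, after which the rate stays at the configured 2e-4 unless a decay schedule (not specified in the README) is applied.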