leafspark committed on
Commit 7e381ea · verified · 1 Parent(s): b748441

Add loss graph

Files changed (1): README.md (+5 −3)
@@ -29,16 +29,18 @@ pipeline_tag: text-generation
  This model is inspired by OpenAI's o1 reasoning model. The dataset was synthetically generated using Claude 3.5 Sonnet.

  ### Training
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6604e5b21eb292d6df393365/pGUUKI77dP9NBVdLf1zB0.png)

- The training was done on Google Colab's free T4, using unsloth. The configuration is as follows:
+ The training was done on Google Colab's free T4, using unsloth (duration: 52.32 minutes). The configuration is as follows:
  - LoRA Rank: 128
  - Packing: enabled
  - Batch size: 2
  - Gradient accumulation steps: 4
  - Epochs: 3
- - Steps: 240
+ - Steps: 30
+ - Max sequence length: 4096

- The training data comprised 81 examples, each approximately 3000 tokens.
+ The training data comprised 81 examples, each approximately 3000 tokens.

  ### Notes
  - It tends to produce very verbose and long reasoning responses
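The configuration in the diff implies an effective batch size of 2 × 4 = 8 sequences per optimizer update. A minimal back-of-the-envelope sketch of how the step count falls out of those numbers follows; the packing estimate (total tokens divided by the maximum sequence length) is an assumption for illustration, not a value taken from the commit:

```python
import math

# Hyperparameters as listed in the model card.
num_examples = 81
avg_tokens_per_example = 3000   # "each approximately 3000 tokens"
max_seq_length = 4096
batch_size = 2
grad_accum_steps = 4
epochs = 3

# Effective batch size seen by the optimizer per update step.
effective_batch = batch_size * grad_accum_steps

# With packing enabled, examples are concatenated into max_seq_length
# chunks, so the sequence count is roughly total tokens / max_seq_length.
# (Idealized estimate: real packing is less perfectly dense.)
total_tokens = num_examples * avg_tokens_per_example
packed_sequences = math.ceil(total_tokens / max_seq_length)

# Approximate optimizer steps for the full run.
steps_per_epoch = math.ceil(packed_sequences / effective_batch)
approx_total_steps = steps_per_epoch * epochs

print(effective_batch, packed_sequences, approx_total_steps)  # → 8 60 24
```

Under these idealized assumptions the estimate (~24 steps) lands near, but below, the reported 30 steps; imperfect packing density would push the real sequence count, and therefore the step count, somewhat higher.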