Add loss graph
README.md
This model is inspired by OpenAI's o1 reasoning model. The dataset was synthetically generated using Claude 3.5 Sonnet.

### Training

![Training loss](https://cdn-uploads.huggingface.co/production/uploads/6604e5b21eb292d6df393365/pGUUKI77dP9NBVdLf1zB0.png)

The training was done on Google Colab's free T4 GPU using unsloth (duration: 52.32 minutes). The configuration is as follows (see the sketch below the list):
- LoRA Rank: 128
- Packing: enabled
- Batch size: 2
- Gradient accumulation steps: 4
- Epochs: 3
- Steps: 30
- Max sequence length: 4096

The training data comprised 81 examples of approximately 3,000 tokens each.
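
A minimal sketch of this setup with unsloth and TRL's `SFTTrainer` is below. The base model name, dataset path, LoRA alpha, and learning rate are placeholder assumptions; only the hyperparameters listed above come from this card.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

max_seq_length = 4096  # max sequence length from the config above

# Load a 4-bit quantized base model to fit on a free Colab T4.
# The model name is a placeholder; the card does not state the base model here.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # assumption
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# Attach LoRA adapters at rank 128, as listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=128,  # assumption: alpha is not stated in the card
)

# Placeholder dataset path; the 81 synthetic examples are not referenced here.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumes a pre-formatted "text" column
    max_seq_length=max_seq_length,
    packing=True,  # packing: enabled
    args=TrainingArguments(
        per_device_train_batch_size=2,  # batch size: 2
        gradient_accumulation_steps=4,  # effective batch size of 8
        num_train_epochs=3,             # epochs: 3 (card reports 30 steps total)
        learning_rate=2e-4,             # assumption: not stated in the card
        fp16=True,                      # the T4 does not support bf16
        logging_steps=1,
        output_dir="outputs",
    ),
)

trainer.train()
```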

### Notes
- It tends to produce very long, verbose reasoning responses.