leafspark committed on
Commit 7e381ea · verified · 1 Parent(s): b748441

Add loss graph

Files changed (1): README.md (+5 −3)
@@ -29,16 +29,18 @@ pipeline_tag: text-generation
  This model is inspired by OpenAI's o1 reasoning model. The dataset was synthetically generated using Claude 3.5 Sonnet.

  ### Training
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6604e5b21eb292d6df393365/pGUUKI77dP9NBVdLf1zB0.png)

- The training was done on Google Colab's free T4, using unsloth. The configuration is as follows:
+ The training was done on Google Colab's free T4, using unsloth (duration: 52.32 minutes). The configuration is as follows:
  - LoRA Rank: 128
  - Packing: enabled
  - Batch size: 2
  - Gradient accumulation steps: 4
  - Epochs: 3
- - Steps: 240
+ - Steps: 30
+ - Max sequence length: 4096

- The training data comprised 81 examples, each approximately 3000 tokens.
+ The training data comprised 81 examples, each approximately 3000 tokens.

  ### Notes
  - It tends to produce very verbose and long reasoning responses
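The configuration in the diff implies an effective batch size of 2 × 4 = 8 sequences per optimizer update. A minimal back-of-the-envelope sketch of how the step count falls out of those numbers follows; the packing estimate (total tokens divided by the maximum sequence length) is an assumption for illustration, not a value taken from the commit:

```python
import math

# Hyperparameters as listed in the model card.
num_examples = 81
avg_tokens_per_example = 3000   # "each approximately 3000 tokens"
max_seq_length = 4096
batch_size = 2
grad_accum_steps = 4
epochs = 3

# Effective batch size seen by the optimizer per update step.
effective_batch = batch_size * grad_accum_steps

# With packing enabled, examples are concatenated into max_seq_length
# chunks, so the sequence count is roughly total tokens / max_seq_length.
# (Idealized estimate: real packing is less perfectly dense.)
total_tokens = num_examples * avg_tokens_per_example
packed_sequences = math.ceil(total_tokens / max_seq_length)

# Approximate optimizer steps for the full run.
steps_per_epoch = math.ceil(packed_sequences / effective_batch)
approx_total_steps = steps_per_epoch * epochs

print(effective_batch, packed_sequences, approx_total_steps)  # → 8 60 24
```

Under these idealized assumptions the estimate (~24 steps) lands near, but below, the reported 30 steps; imperfect packing density would push the real sequence count, and therefore the step count, somewhat higher.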