thatupiso committed on
Commit
82a3989
1 Parent(s): 64ac2f5

End of training

README.md CHANGED
@@ -4,8 +4,7 @@ library_name: transformers
  model_name: SmolLM2-FT-DPO2
  tags:
  - generated_from_trainer
- - smol-course
- - module_1
+ - dpo-smolK12-100
  - trl
  - dpo
  licence: license
@@ -29,7 +28,7 @@ print(output["generated_text"])
 
  ## Training procedure
 
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/thatupiso-code-org/huggingface/runs/qr19ujp2)
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/thatupiso-code-org/huggingface/runs/xpcn3ywm)
 
  This model was trained with DPO, a method introduced in [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://huggingface.co/papers/2305.18290).
 
runs/Dec12_20-07-50_3393362a4d02/events.out.tfevents.1734034093.3393362a4d02.7235.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c4fe8439c4d0e4bf0e674877c5e8b1f87e3d1eca8060c9c4efde206930a3d294
+ size 13260
runs/Dec12_20-23-09_3393362a4d02/events.out.tfevents.1734035019.3393362a4d02.7235.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0f9f0e0fe7b9f89ae5720ca6baf57453966f4c8c3158a921f02bb12b8d8aa3bf
+ size 27116
runs/Dec12_20-24-40_3393362a4d02/events.out.tfevents.1734035190.3393362a4d02.7235.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:853133961a02f0d971d003a31cb5cef46086c39367d37c950dd84c673478a3d0
+ size 40308
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a1c3b48e802c46b7298aea7b9c84993251d389226ee3bef09e82528f7370ec07
+ oid sha256:82c7343ca0657002fe93055a64e96268492c8ea1017027804786a56752e20079
  size 6072