DUAL-GPO/phi-2-gpo-v8-i1
PEFT
TensorBoard
Safetensors
HuggingFaceH4/ultrafeedback_binarized
phi
alignment-handbook
Generated from Trainer
trl
dpo
custom_code
License: apache-2.0
phi-2-gpo-v8-i1/runs
Commit History
Training in progress, step 1800 (0fa1ca0, verified), lole25 committed on May 12
Training in progress, step 1600 (0eeea37, verified), lole25 committed on May 12
Training in progress, step 1500 (0fe3c9b, verified), lole25 committed on May 12
Training in progress, step 1400 (d7af4cc, verified), lole25 committed on May 12
Training in progress, step 1300 (df5b059, verified), lole25 committed on May 12
Training in progress, step 1200 (1f9fc47, verified), lole25 committed on May 12
Training in progress, step 1100 (33f171b, verified), lole25 committed on May 12
Training in progress, step 1000 (7d4d39c, verified), lole25 committed on May 12
Training in progress, step 900 (ec8501a, verified), lole25 committed on May 12
Training in progress, step 800 (7794131, verified), lole25 committed on May 12
Training in progress, step 700 (35a0537, verified), lole25 committed on May 12
Training in progress, step 600 (a90770a, verified), lole25 committed on May 12
Training in progress, step 500 (c6ea2ee, verified), lole25 committed on May 12
Training in progress, step 400 (f5f4588, verified), lole25 committed on May 12
Training in progress, step 200 (3e43bd6, verified), lole25 committed on May 12
Training in progress, step 100 (3fff253, verified), lole25 committed on May 12