DUAL-GPO/zephyr-7b-gpo-v4-i3
PEFT
TensorBoard
Safetensors
HuggingFaceH4/ultrafeedback_binarized
mistral
alignment-handbook
Generated from Trainer
trl
dpo
License: apache-2.0
Commit History (branch: main)
End of training (e1189c0, verified), lole25 committed on May 14
Model save (3a19861, verified), lole25 committed on May 14
Training in progress, step 1800 (655ade3, verified), lole25 committed on May 14
Training in progress, step 1600 (5beb170, verified), lole25 committed on May 14
Training in progress, step 1200 (d1dc0a1, verified), lole25 committed on May 14
Training in progress, step 700 (4b0cf5d, verified), lole25 committed on May 14
Training in progress, step 600 (f383737, verified), lole25 committed on May 14
Training in progress, step 400 (641386d, verified), lole25 committed on May 14
Training in progress, step 300 (b41ea36, verified), lole25 committed on May 14
Training in progress, step 100 (21a01b1, verified), lole25 committed on May 14
initial commit (b0a0378, verified), lole25 committed on May 14