Hugging Face
wxzhang/dpo-selective-alpaca
Text Generation
Transformers
Safetensors
PKU-Alignment/PKU-SafeRLHF
llama
alignment-handbook
Generated from Trainer
trl
dpo
conversational
text-generation-inference
Inference Endpoints
main
Commit History
End of training
83248f0
verified
wxzhang
committed on
Apr 23
Model save
fb2ade4
verified
wxzhang
committed on
Apr 23
Training in progress, step 4500
ab07ccb
verified
wxzhang
committed on
Apr 23
Training in progress, step 4000
fff0b54
verified
wxzhang
committed on
Apr 23
Training in progress, step 3500
b801c0e
verified
wxzhang
committed on
Apr 23
Training in progress, step 3000
b81e633
verified
wxzhang
committed on
Apr 23
Training in progress, step 2500
66a4c82
verified
wxzhang
committed on
Apr 22
Training in progress, step 2000
b53a193
verified
wxzhang
committed on
Apr 22
Training in progress, step 1500
6ee5d60
verified
wxzhang
committed on
Apr 22
Training in progress, step 1000
862dee9
verified
wxzhang
committed on
Apr 22
Training in progress, step 500
4bd56a1
verified
wxzhang
committed on
Apr 22
initial commit
97788ed
verified
wxzhang
committed on
Apr 22