jamesoneill12/outputs-dpo
Text Generation
Transformers
Safetensors
snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
llama
alignment-handbook
trl
dpo
Generated from Trainer
conversational
Inference Endpoints
text-generation-inference
main
Commit History
f4a81a3  End of training (verified), jamesoneill12 committed on Mar 18
031d11d  Model save (verified), jamesoneill12 committed on Mar 18
07e7d22  Training in progress, epoch 7 (verified), jamesoneill12 committed on Mar 18
4dbb910  Training in progress, epoch 6 (verified), jamesoneill12 committed on Mar 18
5d5ed72  Training in progress, epoch 4 (verified), jamesoneill12 committed on Mar 18
7df3824  Training in progress, epoch 0 (verified), jamesoneill12 committed on Mar 18
a7259d1  Training in progress, step 400 (verified), jamesoneill12 committed on Mar 18
2718a02  Training in progress, step 300 (verified), jamesoneill12 committed on Mar 18
ea4ce45  Training in progress, step 200 (verified), jamesoneill12 committed on Mar 18
7e2fc8e  initial commit (verified), jamesoneill12 committed on Mar 17