trl-lib/pythia-6.9b-deduped-tldr-online-dpo
