lm-human-preference-details vwxyzjn/train_policy_accelerate__sentiment_offline_5k.json__seed1__1696447674 Text Generation • Updated Oct 4, 2023 • 2 lm-human-preference-details/train_policy_accelerate__sentiment_offline_5k.json__seed1 Text Generation • Updated Oct 4, 2023 • 1
vwxyzjn/train_policy_accelerate__sentiment_offline_5k.json__seed1__1696447674 Text Generation • Updated Oct 4, 2023 • 2
lm-human-preference-details/train_policy_accelerate__sentiment_offline_5k.json__seed1 Text Generation • Updated Oct 4, 2023 • 1
vwxyzjn/ppo_zephyr_vllm_warmup_1e-6_larger_bs_300k_episodes Text Generation • Updated 6 days ago • 16