lm-human-preference-details
vwxyzjn/train_policy_accelerate__sentiment_offline_5k.json__seed1__1696447674 Text Generation • Updated Oct 4, 2023 • 3
lm-human-preference-details/train_policy_accelerate__sentiment_offline_5k.json__seed1 Text Generation • Updated Oct 4, 2023 • 1
vwxyzjn/ppo_zephyr_vllm_warmup_1e-6_larger_bs_300k_episodes Text Generation • Updated 2 days ago • 15
vwxyzjn/summarize_from_feedback_tldr_3_filtered_oai_preprocessing_1711138793 Viewer • Updated Mar 22
vwxyzjn/summarize_from_feedback_tldr_3_filtered_oai_preprocessing_1711138084 Viewer • Updated Mar 22