hamishivi committed on
Commit
ac51e98
1 Parent(s): 4a372aa

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -74,7 +74,7 @@ We have included a [chat template](https://huggingface.co/docs/transformers/main
  | preference_big_mixture | = | [tulu-v2.5-ppo-13b-uf-mean-13b-mix-rm](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean-13b-mix-rm) | [tulu-v2.5-13b-preference-mix-rm](https://huggingface.co/allenai/tulu-v2.5-13b-preference-mix-rm) | [tulu-v2.5-ppo-13b-uf-mean-13b-mix-rm-value](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean-13b-mix-rm-value) |
  | preference_big_mixture | = | [tulu-v2.5-ppo-13b-uf-mean-70b-mix-rm](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean-70b-mix-rm) | [tulu-v2.5-70b-preference-mix-rm](https://huggingface.co/allenai/tulu-v2.5-70b-preference-mix-rm) | [tulu-v2.5-ppo-13b-uf-mean-70b-mix-rm-value](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean-70b-mix-rm-value) |
  | ultrafeedback_mean_aspects | = | [tulu-v2.5-ppo-13b-uf-mean](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean) | [tulu-v2.5-13b-uf-rm](https://huggingface.co/allenai/tulu-v2.5-13b-uf-rm) | [tulu-v2.5-ppo-13b-uf-mean-13b-uf-rm-value](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean-13b-uf-rm-value) |
- | preference_big_mixture | = | [tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm-mixed-prompts](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm-mixed-prompts) | [tulu-v2.5-70b-uf-rm](https://huggingface.co/allenai/tulu-v2.5-70b-uf-rm) * with extra prompts | [tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm-mixed-prompts-value](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm-mixed-prompts-value) |
+ | ultrafeedback_mean_aspects | = | [tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm-mixed-prompts](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm-mixed-prompts) | [tulu-v2.5-70b-uf-rm](https://huggingface.co/allenai/tulu-v2.5-70b-uf-rm) * with extra prompts | [tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm-mixed-prompts-value](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm-mixed-prompts-value) |
  | hh_rlhf_60k | [tulu-v2.5-dpo-13b-hh-rlhf-60k](https://huggingface.co/allenai/tulu-v2.5-dpo-13b-hh-rlhf-60k) | [tulu-v2.5-ppo-13b-hh-rlhf-60k](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-hh-rlhf-60k) | [tulu-v2.5-13b-hh-rlhf-60k-rm](https://huggingface.co/allenai/tulu-v2.5-13b-hh-rlhf-60k-rm) | |
  | chatbot_arena_2023 | [tulu-v2.5-dpo-13b-chatbot-arena-2023](https://huggingface.co/allenai/tulu-v2.5-dpo-13b-chatbot-arena-2023) | [tulu-v2.5-ppo-13b-chatbot-arena-2023](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-chatbot-arena-2023) | [tulu-v2.5-13b-chatbot-arena-2023-rm](https://huggingface.co/allenai/tulu-v2.5-13b-chatbot-arena-2023-rm) | |
  | stack_exchange_60k | [tulu-v2.5-dpo-13b-stackexchange-60k](https://huggingface.co/allenai/tulu-v2.5-dpo-13b-stackexchange-60k) | [tulu-v2.5-ppo-13b-stackexchange-60k](https://huggingface.co/allenai/tulu-v2.5-ppo-13b-stackexchange-60k) | [tulu-v2.5-13b-stackexchange-60k-rm](https://huggingface.co/allenai/tulu-v2.5-13b-stackexchange-60k-rm) | |