arxiv:2405.07863
Wei Xiong
weqweasdas
AI & ML interests
Machine learning, RLHF
Organizations
models
23
weqweasdas/zephyr-7b-dpo-full • Text Generation • 9
weqweasdas/zephyr-7b-gemma-dpo
weqweasdas/zephyr-7b-sft-full
weqweasdas/zephyr-7b-dpo-qlora
weqweasdas/gpt2-cpt-dutch • Text Generation • 12
weqweasdas/zephyr-7b-gemma-sft
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6_weight085 • Text Generation • 11
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6 • Text Generation • 10
weqweasdas/raft_baseline_zephyr_packing_model6 • Text Generation • 11
weqweasdas/raft_baseline_openchat_llama13b_model1 • Text Generation • 11
datasets
43
weqweasdas/xxxx_test • 100
weqweasdas/xxxx.json • 100
weqweasdas/ds_armo_final • 805 • 2
weqweasdas/ds_bt_final • 805 • 1
weqweasdas/bo8_bt • 805
weqweasdas/bo2_bt • 805
weqweasdas/bo2_armo • 805
weqweasdas/bo8_armo • 805
weqweasdas/ds_armo • 805 • 3
weqweasdas/ds_pm • 805 • 3