Yiran Wang
yiran-wang3
AI & ML interests
Reinforcement Learning, Self-Driving
Organizations
Collections
3
sppo training with original gt (no explanation)
yiran-wang3/ds_chat_cosine_original_sppo-GT_ORIGINAL-sppo-0.1-cos-rmsp-1e-7-checkpoint-391 • Text Generation • 4
yiran-wang3/ds_chat_cosine_original_sppo-GT_ORIGINAL-sppo-0.1-cos-rmsp-1e-7-checkpoint-1173 • Text Generation • 4
yiran-wang3/ds_chat_cosine_original_sppo-GT_ORIGINAL-sppo-0.1-cos-rmsp-1e-7-checkpoint-782 • Text Generation • 3
yiran-wang3/ds_chat_cosine_original_sppo-GT_ORIGINAL_MASKED-sppo-0.1-cos-rmsp-1e-7-checkpoint-391 • Text Generation • 3
Models
117
yiran-wang3/ds_chat_with_text_mask_sppo_hard_new_iter0_2024-10-05-12.16
yiran-wang3/ds_chat_sppo_hard_new_iter0_2024-10-05-12.12
yiran-wang3/ds_chat_adamw_iter1_sppo_hard_new_iter1_2024-10-01-10.32 • 17
yiran-wang3/ds_chat_sppo_hard_new_iter0_yite_config_topp09_temp07_rmsprop • 3
yiran-wang3/ds_chat_sppo_hard_new_iter0_yite_config_topp09_temp07_adamw • 86
yiran-wang3/ds_chat_sppo_hard_new_iter0_constant_1e-6 • 23
yiran-wang3/ds_chat_sppo_hard_cosine_iter0_2024-09-17-09.48 • 1
yiran-wang3/ds_chat_sppo_hard_cosine_iter0_2024-09-16-21.02 • 4
yiran-wang3/ds_chat_sppo_hard_cosine_iter0_masked_cosine_schedule • 2
yiran-wang3/ds_chat_sppo_hard_cosine_iter0_2024-09-16-15.36 • 2
Datasets
18
yiran-wang3/None-full_response_traceback
yiran-wang3/None-binarized
yiran-wang3/original_cn_mining_sandbox_debug_iter0-full_response_traceback • 2
yiran-wang3/original_cn_mining_sandbox_debug_iter0-binarized • 2
yiran-wang3/original_cn_rl_oj_debug_iter0-full_response_traceback • 2
yiran-wang3/original_cn_rl_oj_debug_iter0-binarized • 2
yiran-wang3/cleaned-mining-deepseek-llm-python-binarized-gt-replace • 24.6k • 344
yiran-wang3/cleaned-mining-codellama-python-base-all-binarized • 26.6k • 2
yiran-wang3/cleaned-mining-codellama-instruct-base-all-binarized • 26.6k • 2
yiran-wang3/cleaned-mining-deepseekcoder67-base-all-binarized • 20.1k • 2 • 1