Kaiwen Wang

kaiwenw

https://kaiwenw.github.io/

AI & ML interests

Reinforcement Learning

Recent Activity

updated a dataset about 17 hours ago

kaiwenw/open_r1_apr9_round1_combined_balanced

published a dataset about 17 hours ago

kaiwenw/open_r1_apr9_round1_combined_balanced

updated a dataset about 18 hours ago

kaiwenw/open_r1_apr9_round1_combined_random

Organizations

None yet

Papers 3

arxiv:2407.15762

arxiv:2403.05385

arxiv:2302.03201

Models 7

kaiwenw/nov11_oasst_aft_llama_lr_3e-5_rerun

Text Generation • Updated Dec 9, 2024

kaiwenw/nov22_lr_3e-6_lora_32_dropout_0.1_all_reject_first_ep_4

Text Generation • Updated Dec 7, 2024 • 1

kaiwenw/nov22_lr_3e-6_lora_32_dropout_0.1_all_reject_first_ep_3

Text Generation • Updated Dec 7, 2024 • 1

kaiwenw/nov22_lr_3e-6_lora_32_dropout_0.1_all_reject_first_ep_2

Text Generation • Updated Dec 7, 2024 • 1

kaiwenw/nov22_lr_3e-6_lora_32_dropout_0.1_all_reject_first_ep_1

Text Generation • Updated Dec 7, 2024 • 1

kaiwenw/nov2_oasst_aft_llama_lr_3e-5

Text Generation • Updated Nov 8, 2024

kaiwenw/oct31_oasst_llama70b_jft

Text Generation • Updated Nov 6, 2024

Datasets 100

kaiwenw/open_r1_apr9_round1_combined_balanced

Viewer • Updated about 17 hours ago • 49.4k

kaiwenw/open_r1_apr9_round1_combined_random

Viewer • Updated about 18 hours ago • 49.4k

kaiwenw/open_r1_apr9_DeepSeek_R1_Distill_Qwen_32B_tokenized

Viewer • Updated 1 day ago • 49.4k

kaiwenw/open_r1_apr9_DeepSeek_R1_Distill_Qwen_14B_tokenized

Viewer • Updated 3 days ago • 49.4k • 17

kaiwenw/open_r1_apr9_DeepSeek_R1_Distill_Qwen_7B_tokenized

Viewer • Updated 3 days ago • 49.4k • 17

kaiwenw/open_r1_apr9_DeepSeek_R1_Distill_Qwen_1.5B_tokenized

Viewer • Updated 4 days ago • 49.4k • 26

kaiwenw/open_r1_apr9

Viewer • Updated 5 days ago • 49.4k • 59

kaiwenw/combine_1.5B_7B_and_32B

Viewer • Updated 10 days ago • 49.5k • 58

kaiwenw/combine_1.5B_and_blockwise

Viewer • Updated 11 days ago • 49.5k • 52

kaiwenw/open_r1_mar2_DeepSeek_R1_Distill_Qwen_1.5B_tokenized

Viewer • Updated 11 days ago • 49.5k • 106