Questions Regarding the Training Dataset

#1
by Ren-Biao-Liu - opened

Thank you for providing this excellent model! I noticed that its name includes "code_contest." Could you please confirm if this model was trained on the "code contest" dataset? I am aware that you previously released a dataset called "code_contest_reasoning." Was this the dataset used for training? Additionally, will this dataset be open-sourced in the future? As I have recently been attempting to train models on this dataset, I was wondering if I could consult you on some specific details. Your assistance would be highly beneficial to me. Looking forward to your reply!

Hey, this model is just for my graduation thesis. I'm just an amateur AI student, so the result model might not be as good as you expect.
Yes, the model is a Phi-3-mini finetuned on my custom dataset to solve problems in code_contest. Each row of the dataset is a pair of (code_contest's problem description, Codeforces editorial). After my thesis finishes and I (hopefully) graduate in around November, the dataset will be open-sourced.

Sign up or log in to comment