Better coding dataset

#1
by rombodawg - opened

If you need a bigger dataset than codealpaca that's formatted in a very similar way, I have made one and you are free to use it.
Link below:

https://huggingface.co/datasets/rombodawg/MegaCodeTraining112k/tree/main

Jina AI org

You rock, @rombodawg! Will try it out.

Let me know if you train a model with my dataset, please! I've been waiting to try that type of model; I just don't have the resources to train one myself.

@bwang0911 @samsja If you guys are interested, I have made a version 3 of my megacode dataset, and this one is the most promising yet. Feel free to use it to train your future models:

https://huggingface.co/datasets/rombodawg/LosslessMegaCodeTrainingV3_2.2m_Evol
