Walmart-the-bag 
posted an update May 12
Replete-AI/code_bagel


Use the code_bagel dataset to make the ultimate coding finetune and compete with the likes of closed-source models!

Made by @rombodawg of Replete-AI, the code_bagel dataset contains over 800 million tokens of deduplicated, uncensored code drawn only from reputable sources on Hugging Face. The data is formatted in the Alpaca instruct format for ease of use in training.
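
For anyone who hasn't worked with the Alpaca instruct format before, here is a rough sketch of how one record is usually turned into a training string. The prompt wording follows the standard Stanford Alpaca template. The field names `instruction`, `input`, and `output` are the usual Alpaca convention and are an assumption here, so check the dataset's actual columns before relying on them. The sample record is made up for illustration.

```python
# Standard Alpaca prompt templates (with and without an "input" field).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def format_alpaca(record: dict) -> str:
    """Render one record as a single prompt-plus-response training string.

    Assumes Alpaca-style keys: "instruction", optional "input", "output".
    """
    has_input = bool(record.get("input"))
    template = PROMPT_WITH_INPUT if has_input else PROMPT_NO_INPUT
    prompt = template.format(
        instruction=record["instruction"],
        input=record.get("input", ""),  # ignored by the no-input template
    )
    return prompt + record["output"]


# Hypothetical record, not taken from the dataset.
sample = {
    "instruction": "Write a Python function that adds two numbers.",
    "input": "",
    "output": "def add(a, b):\n    return a + b",
}

if __name__ == "__main__":
    print(format_alpaca(sample))
```

Most training frameworks can apply this kind of template for you, so treat this as a way to see what the model ends up reading rather than something you have to write yourself.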